StringTokenizingReader module

Module contents

template<typename InputLineIteratorT>
class StringTokenizingReader

Iterate over a range of input strings and tokenize each one.

This is the third of four steps in the pipeline for reading points from a file. The first step reads the file line by line. The second filters out the lines that are comments. The third, implemented here, tokenizes the surviving lines into fields that a later step uses to populate a point.
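The first three steps can be sketched with the standard library alone. This is an illustration of the idea, not the library's implementation: the function names `read_lines`, `drop_comments`, `split_fields` and `tokenize_stream` are hypothetical, and the `#` comment prefix is an assumption made for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Step 1: read the input line by line.
std::vector<std::string> read_lines(std::istream& input)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(input, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Step 2: drop lines that start with the comment prefix ("#" is assumed here).
std::vector<std::string> drop_comments(std::vector<std::string> const& lines,
                                       std::string const& comment_prefix = "#")
{
    std::vector<std::string> kept;
    for (std::string const& line : lines) {
        if (line.compare(0, comment_prefix.size(), comment_prefix) != 0) {
            kept.push_back(line);
        }
    }
    return kept;
}

// Step 3: split one line on any of the delimiter characters (space and tab).
std::vector<std::string> split_fields(std::string const& line,
                                      std::string const& delimiters = " \t")
{
    std::vector<std::string> fields;
    std::string current;
    for (char c : line) {
        if (delimiters.find(c) != std::string::npos) {
            fields.push_back(current);  // a delimiter closes the current field
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    fields.push_back(current);  // the last field has no trailing delimiter
    return fields;
}

// Run steps 1 to 3 and return one vector of fields per surviving line.
std::vector<std::vector<std::string>> tokenize_stream(std::istream& input)
{
    std::vector<std::vector<std::string>> rows;
    for (std::string const& line : drop_comments(read_lines(input))) {
        rows.push_back(split_fields(line));
    }
    return rows;
}
```

Feeding `tokenize_stream` a `std::istringstream` that holds one comment line and two data lines yields two rows of fields. Unlike this sketch, the reader works lazily through iterators rather than building every row up front.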

Public Types

typedef InputLineIteratorT input_line_iter_type
typedef InputLineIteratorT::value_type string_type
typedef TokenizedStringIterator iterator
typedef TokenizedStringIterator const const_iterator

Public Functions

inline StringTokenizingReader()

Initialize an empty reader with default delimiters (space, tab).

inline StringTokenizingReader(input_line_iter_type Start, input_line_iter_type Finish)

Initialize a tokenizer with an input range and default delimiters.

inline StringTokenizingReader(input_line_iter_type Start, input_line_iter_type Finish, string_type const &Delim)

Initialize a tokenizer with an input range and your own delimiters.

inline StringTokenizingReader(StringTokenizingReader const &other)

Copy state from another tokenizer.

inline virtual ~StringTokenizingReader()

Destructor.

inline void set_field_delimiter(string_type const &delim)

Set the delimiter character to use in tokenization.

The single character in the string you supply will be used as a field delimiter.

Parameters:

delim[in] Delimiter character to be set

inline string_type field_delimiter() const

Return the delimiter character currently in use.

inline void set_escape_character(string_type const &escape)

Set the escape character to use in tokenization.

You must supply a string containing either zero or one characters. The escape character removes the special meaning of whatever character follows it, usually a newline, separator, or quote character.

Parameters:

escape[in] Escape character to be set

inline string_type escape_character() const

Returns:

The escape character currently in use.

inline void set_quote_character(string_type const &quote)

Set the quote character to use in tokenization.

The single character in the string you supply (assuming it is not empty) will be used as a quote character. Inside a quoted string (a string that begins and ends with the quote character), field delimiters (e.g. comma) will be ignored. Also, inside a quoted string, embedded quote characters must be escaped.

Parameters:

quote[in] Quote character to be set
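How the delimiter, escape, and quote characters interact can be shown with a small state machine. This is a simplified sketch under stated assumptions, not the class's actual code: the private typedefs indicate that the class relies on `boost::escaped_list_separator`, whose handling of escape sequences is stricter than what is shown here. The helper name `split_escaped` is hypothetical, and the requirement that the three characters differ is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper illustrating the delimiter / escape / quote rules.
std::vector<std::string> split_escaped(std::string const& line,
                                       char delimiter,
                                       char escape,
                                       char quote)
{
    // Assumption of this sketch: the three special characters are distinct.
    assert(delimiter != escape && delimiter != quote && escape != quote);

    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        char const c = line[i];
        if (c == escape && i + 1 < line.size()) {
            // Escaped character: copy the next character literally.
            current.push_back(line[++i]);
        } else if (c == quote) {
            // Quote characters toggle quoted mode and are not part of the field.
            in_quotes = !in_quotes;
        } else if (c == delimiter && !in_quotes) {
            // An unquoted delimiter ends the current field.
            fields.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    fields.push_back(current);
    return fields;
}
```

With a comma delimiter, backslash escape, and double-quote quote character, the line `a,"b,c",d\,e` splits into three fields: `a`, `b,c`, and `d,e`. The comma inside the quoted string and the escaped comma are both kept as ordinary characters.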

inline string_type quote_character() const

Returns:

The quote character currently in use.

inline StringTokenizingReader &operator=(StringTokenizingReader const &other)

Assign a StringTokenizingReader to the value of another.

Parameters:

other[in] StringTokenizingReader to assign value of

Returns:

Reader with the new assigned value

inline bool operator==(StringTokenizingReader const &other) const

Check whether two readers are equal.

Two readers are equal if all of their properties are equal.

Parameters:

other[in] StringTokenizingReader for comparison

Returns:

Boolean indicating equivalency

inline bool operator!=(StringTokenizingReader const &other) const

Check whether two StringTokenizingReaders are unequal.

Parameters:

other[in] StringTokenizingReader for comparison

Returns:

Boolean indicating inequality

inline void set_input_range(input_line_iter_type const &start, input_line_iter_type const &finish)

Set the beginning and the end of the input range.

Parameters:
  • start[in] The iterator to use for the start of input

  • finish[in] The iterator to use for the end of input

inline iterator begin() const

Return an iterator to the first tokenized line.

This takes the input range and the delimiter, escape, and quote characters you have configured and starts tokenizing. You can iterate in the standard C++ fashion until you reach end().

Note

Any changes you make to the reader configuration will invalidate existing iterators.

Returns:

Iterator to the first tokenized line

inline iterator end() const

Return an iterator to detect when tokenizing has ended.

This iterator is guaranteed not to point at any valid tokenized line. begin() == end() holds only when every line in the input range has been consumed.

Returns:

Iterator past the end of the tokenized-line sequence
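The begin()/end() contract can be illustrated with a reduced stand-in. `MiniTokenizingReader` below is hypothetical and is not the library class: it splits on a single delimiter character, ignores escapes and quotes, and stores tokens in a shared vector instead of using a Boost tokenizer. Its postfix operator++ returns the previous position by value, which is the conventional form, whereas the documented signature returns a reference.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reader, restricted to vectors of strings.
class MiniTokenizingReader
{
public:
    using line_iterator = std::vector<std::string>::const_iterator;
    using token_iterator = std::vector<std::string>::const_iterator;
    using token_iterator_pair = std::pair<token_iterator, token_iterator>;

    class iterator
    {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = token_iterator_pair;
        using difference_type = std::ptrdiff_t;
        using pointer = token_iterator_pair const*;
        using reference = token_iterator_pair const&;

        iterator(line_iterator current, line_iterator end, char delimiter)
            : Current(current), End(end), Delimiter(delimiter)
        {
            tokenize_this_line();
        }

        // Dereferencing yields the (begin, end) token range of the current line.
        reference operator*() const { return TokenRange; }
        pointer operator->() const { return &TokenRange; }

        iterator& operator++()
        {
            ++Current;
            tokenize_this_line();
            return *this;
        }

        iterator operator++(int)
        {
            iterator previous(*this);
            ++(*this);
            return previous;
        }

        bool operator==(iterator const& other) const { return Current == other.Current; }
        bool operator!=(iterator const& other) const { return !(*this == other); }

    private:
        void tokenize_this_line()
        {
            // Shared storage keeps TokenRange valid when the iterator is copied.
            Tokens = std::make_shared<std::vector<std::string>>();
            if (Current != End) {
                std::string field;
                for (char c : *Current) {
                    if (c == Delimiter) {
                        Tokens->push_back(field);
                        field.clear();
                    } else {
                        field.push_back(c);
                    }
                }
                Tokens->push_back(field);
            }
            TokenRange = token_iterator_pair(Tokens->cbegin(), Tokens->cend());
        }

        line_iterator Current;
        line_iterator End;
        char Delimiter;
        std::shared_ptr<std::vector<std::string>> Tokens;
        token_iterator_pair TokenRange;
    };

    MiniTokenizingReader(line_iterator start, line_iterator finish, char delimiter = ' ')
        : InputLinesBegin(start), InputLinesEnd(finish), FieldDelimiter(delimiter)
    {
    }

    iterator begin() const { return iterator(InputLinesBegin, InputLinesEnd, FieldDelimiter); }
    iterator end() const { return iterator(InputLinesEnd, InputLinesEnd, FieldDelimiter); }

private:
    line_iterator InputLinesBegin;
    line_iterator InputLinesEnd;
    char FieldDelimiter;
};
```

A caller loops with `for (auto it = reader.begin(); it != reader.end(); ++it)` and walks the tokens of each line from `it->first` to `it->second`. A reader built over an empty range satisfies `begin() == end()` immediately.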

inline const_iterator const_begin() const

Get an iterator pointing to the beginning of the stream.

Returns:

Iterator pointing to the beginning of the current stream

inline const_iterator const_end() const

Get an iterator pointing to the end of the stream.

Returns:

Iterator pointing to the end of the current stream

Private Types

typedef boost::escaped_list_separator<typename input_line_iter_type::value_type::value_type> separator_type
typedef boost::tokenizer<separator_type> tokenizer_type
typedef std::pair<typename tokenizer_type::iterator, typename tokenizer_type::iterator> token_iterator_pair

Private Members

input_line_iter_type InputLinesBegin
input_line_iter_type InputLinesEnd
string_type FieldDelimiter
string_type EscapeCharacter
string_type QuoteCharacter

class TokenizedStringIterator

Input iterator over the tokenized lines. Dereferencing it yields a pair of token iterators that delimit the tokens of the current line.

Public Types

using iterator_category = std::input_iterator_tag
using value_type = token_iterator_pair
using difference_type = std::ptrdiff_t
using pointer = token_iterator_pair*
using reference = token_iterator_pair&
using iterator = typename tokenizer_type::iterator

Public Functions

inline TokenizedStringIterator()

Instantiate a default TokenizedStringIterator.

inline ~TokenizedStringIterator()

Destructor.

inline TokenizedStringIterator(input_line_iter_type Begin, input_line_iter_type End, string_type const &Delim, string_type const &Escape, string_type const &Quote)

Instantiate a TokenizedStringIterator using the specified properties.

Parameters:
  • Begin[in] Iterator to start at

  • End[in] Iterator to end at

  • Delim[in] Character to use for delimiting

  • Escape[in] Character to use for escaping

  • Quote[in] Character to use for quoting

inline TokenizedStringIterator(TokenizedStringIterator const &other)

Copy constructor. Create a TokenizedStringIterator as a copy of another.

Parameters:

other[in] TokenizedStringIterator to copy from

inline TokenizedStringIterator &operator=(TokenizedStringIterator const &other)

Assign a TokenizedStringIterator to the value of another.

Parameters:

other[in] TokenizedStringIterator to assign value of

Returns:

TokenizedStringIterator with the new assigned value

inline token_iterator_pair const &operator*() const

Dereference the iterator.

Returns:

Pair of token iterators delimiting the tokens of the current line
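Because dereferencing yields a token_iterator_pair, a caller usually walks from `first` to `second` to reach the individual tokens. The helper below is a hypothetical convenience, not part of the class, and assumes only that the token iterators dereference to something convertible to `std::string`.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: copy the tokens in [first, second) into a vector.
template <typename TokenIterator>
std::vector<std::string> collect_tokens(
    std::pair<TokenIterator, TokenIterator> const& token_range)
{
    return std::vector<std::string>(token_range.first, token_range.second);
}
```

Copying the tokens out this way is useful when they must outlive the iterator, since advancing an input iterator may invalidate the range it previously returned.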

inline token_iterator_pair const *operator->() const

Access the token range for the current line through a pointer.

Returns:

Pointer to the pair of token iterators for the current line

inline TokenizedStringIterator &operator++()

Advance the iterator to the next position in the sequence (prefix form).

Returns:

Reference to this iterator after it has advanced

inline TokenizedStringIterator &operator++(int)

Advance the iterator to the next position in the sequence (postfix form).

Returns:

Reference to the iterator

inline bool operator==(TokenizedStringIterator const &other) const

Check whether two TokenizedStringIterators are equal.

Two TokenizedStringIterators are equal if all of their properties are equal.

Parameters:

other[in] TokenizedStringIterator for comparison

Returns:

Boolean indicating equivalency

inline bool operator!=(TokenizedStringIterator const &other) const

Check whether two iterators are unequal.

Parameters:

other[in] Iterator for comparison

Returns:

Boolean indicating inequality

Private Functions

inline void _tokenize_this_line()

Private Members

tokenizer_type *Tokenizer
token_iterator_pair TokenRangeCurrentString
input_line_iter_type InputLinesBegin
input_line_iter_type InputLinesEnd
string_type FieldDelimiter
string_type EscapeCharacter
string_type QuoteCharacter