Log in / create account | Login with OpenID
DocForge
An Open Wiki For Software Developers

String

From DocForge

A string is an array of characters. In some languages, a string is its own object, which has many utility functions built into it.

String
IteratorsRandom-access
Big-O
LookupO(1)
List OperationsO(n)+
Front OperationsO(n)+
Back (Stack) OperationsO(1)+
+ indicates occasionally a significant extra cost is incurred

Contents

String Escape Characters [edit]

Strings are usually denoted by either a ' (apostrophe) or a " (quotation mark), closed by the same sign. To place either of these characters into the string itself, it is usually denoted using a special character syntax. That escape character is usually a "\" (backslash) or "/" (slash, forward slash, "whack" in Microsoftnese). Thus a quotation mark in a string would be /". This is also how tab (/t), newline (/n), file end (/0), and many other special characters are represented in so-called string literals, or commonly occurring strings that are literally placed in the code, as opposed to being stored in an external string database.

Special Implementations of Strings [edit]

A string's greatest drawback in some programming languages is that it is immutable, or it cannot be changed. In some languages individual characters in the string can be changed. Making a string larger or smaller requires creating a new string of the desired size since a String is nothing more than an array. Because of these drawbacks, several different alternate implementations of strings exist. In most instances normal strings will suffice, however, there does come a time when performance requires a more specialized tool.

Java's StringBuilder [edit]

Java features a StringBuilder, which is a String based on a Linked List. This sacrifices the O(1) lookup for the ability to dynamically add, remove, or insert characters at any time without the memory allocation overhead incurred by the traditional array-based String implementation.

Character Tree [edit]

A character tree isn't really a true String, since it's more of a Dictionary tool. A character tree is a b-tree of characters which can be used as a kind of "word book" to lookup the existence of a String from a trusted source. Every letter is a node, and also carries a boolean value or some form of entry to determine whether it is a valid psuedoleaf node. Remember, Aardvark and 'a' both start with the same letter, and Aard isn't a word! The boolean provides a mechanism to prematurely terminate the lookup, without needing a dedicated leaf node. This saves memory, since words which share similar prefixes share the same memory in the B-Tree. This also gives it exceptionally rapid addition, since Strings can be read in from a file on disk with very little parsing overhead. This implementation is far faster than maintaining a list of all words, since even the first letter alone saves a great deal of storage because it is not replicated in all places.