

CSV string handling


Question

Typical way of creating a CSV string (pseudocode):

  1. Create a CSV container object (like a StringBuilder in C#).
  2. Loop through the strings you want to add, appending a comma after each one.
  3. After the loop, remove that last superfluous comma.

Code sample:

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

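    // Remove the trailing comma (skipped when the list was empty,
    // since Remove would otherwise throw on a negative index).
    if (sb.Length > 0)
        sb.Remove(sb.Length - 1, 1);
    //sb.Replace(",", "", sb.Length - 1, 1);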

    return sb.ToString();
}

I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?

I feel that there should be an easier/cleaner/more efficient way of removing that last comma. Any ideas?

2016/02/07
2/7/2016 2:30:15 PM

Accepted Answer

You could use LINQ to Objects:

string[] strings = contactList.Select(c => c.Name).ToArray();
string csv = string.Join(",", strings);

Obviously that could all be done in one line, but it's a bit clearer on two.
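
For reference, the one-line form might look like the sketch below. It assumes .NET 4 or later, where string.Join accepts an IEnumerable<string>, so the ToArray() call can be dropped. The Contact class here is a hypothetical stand-in with only a Name property, and the method takes any sequence of contacts rather than the question's ContactList.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Contact type.
public class Contact
{
    public string Name { get; set; }
}

public static class ContactCsv
{
    // Joins the contact names with commas; no quoting or escaping is applied.
    public static string ReturnAsCSV(IEnumerable<Contact> contactList)
    {
        return string.Join(",", contactList.Select(c => c.Name));
    }
}
```

An empty list yields an empty string, so there is no trailing comma to clean up in either case.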

2010/03/26
3/26/2010 5:29:31 PM

Your code is not really compliant with the full CSV format. If you are just generating CSV from data that has no commas, leading/trailing spaces, tabs, newlines or quotes, it should be fine. However, in most real-world data-exchange scenarios, you do need the full implementation.

To generate proper CSV, you can use this:

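// Field delimiter used by both methods below. Assumed to be a comma;
// the original snippet references DelimiterChar without defining it.
private const char DelimiterChar = ',';

public static String EncodeCsvLine(params String[] fields)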
{
    StringBuilder line = new StringBuilder();

    for (int i = 0; i < fields.Length; i++)
    {
        if (i > 0)
        {
            line.Append(DelimiterChar);
        }

        String csvField = EncodeCsvField(fields[i]);
        line.Append(csvField);
    }

    return line.ToString();
}

static String EncodeCsvField(String field)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(field);

    // Some fields with special characters must be embedded in double quotes
    bool embedInQuotes = false;

    // Embed in quotes to preserve leading/trailing whitespace
    if (sb.Length > 0 && 
        (sb[0] == ' ' || 
         sb[0] == '\t' ||
         sb[sb.Length-1] == ' ' || 
         sb[sb.Length-1] == '\t' ))
    {
        embedInQuotes = true;
    }

    for (int i = 0; i < sb.Length; i++)
    {
        // Embed in quotes to preserve: commas, line-breaks etc.
        if (sb[i] == DelimiterChar || 
            sb[i]=='\r' || 
            sb[i]=='\n' || 
            sb[i] == '"') 
        { 
            embedInQuotes = true;
            break;
        }
    }

    // If the field itself has quotes, they must each be represented 
    // by a pair of consecutive quotes.
    sb.Replace("\"", "\"\"");

    String rv = sb.ToString();

    if (embedInQuotes)
    {
        rv = "\"" + rv + "\"";
    }

    return rv;
}

It might not be the world's most efficient code, but it has been tested. The real world sucks compared to quick sample code :)
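
To make the quoting rules concrete, here is a compact sketch that applies the same rules (quote on delimiter, line break, double quote, or leading/trailing space or tab; double any embedded quotes). It is not the answer's code: the CsvSketch class and its method names are made up for illustration, and a comma delimiter is assumed.

```csharp
using System;
using System.Linq;

public static class CsvSketch
{
    // Characters that force a field into double quotes (comma assumed as delimiter).
    private static readonly char[] Special = { ',', '"', '\r', '\n' };

    public static string EncodeField(string field)
    {
        if (string.IsNullOrEmpty(field)) return string.Empty;

        // Leading or trailing spaces/tabs are only preserved inside quotes.
        bool edgeWhitespace =
            field[0] == ' ' || field[0] == '\t' ||
            field[field.Length - 1] == ' ' || field[field.Length - 1] == '\t';

        if (!edgeWhitespace && field.IndexOfAny(Special) < 0)
            return field; // nothing to protect, emit as-is

        // Double embedded quotes, then wrap the whole field in quotes.
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    public static string EncodeLine(params string[] fields)
    {
        return string.Join(",", fields.Select(f => EncodeField(f)));
    }
}
```

Calling CsvSketch.EncodeLine("Smith, John", "says \"hi\"", " padded ", "plain") produces the line

"Smith, John","says ""hi"""," padded ",plain

so only the fields that need protection are quoted.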

2016/02/07

Don't forget our old friend "for". It's not as nice-looking as foreach but it has the advantage of being able to start at the second element.

public string ReturnAsCSV(ContactList contactList)
{
    if (contactList == null || contactList.Count == 0)
        return string.Empty;

    StringBuilder sb = new StringBuilder(contactList[0].Name);

    for (int i = 1; i < contactList.Count; i++)
    {
        sb.Append(",");
        sb.Append(contactList[i].Name);
    }

    return sb.ToString();
}

You could also wrap the second Append in an "if" that tests whether the Name property contains a double-quote or a comma, and if so, escape them appropriately.
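
A minimal sketch of that idea follows. It takes the names as a plain list of strings (an assumption, to keep the example self-contained), applies the check to the first element as well, and handles only commas and double quotes as described above. Line breaks and leading/trailing whitespace are not covered. AppendEscaped is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ContactNamesCsv
{
    // Takes the names directly; the original method reads them
    // from contactList[i].Name.
    public static string ReturnAsCSV(IList<string> names)
    {
        if (names == null || names.Count == 0)
            return string.Empty;

        StringBuilder sb = new StringBuilder();
        AppendEscaped(sb, names[0]);

        for (int i = 1; i < names.Count; i++)
        {
            sb.Append(',');
            AppendEscaped(sb, names[i]);
        }

        return sb.ToString();
    }

    // Quotes the value only if it contains a comma or a double quote,
    // doubling any embedded double quotes.
    private static void AppendEscaped(StringBuilder sb, string value)
    {
        if (value != null && value.IndexOfAny(new[] { ',', '"' }) >= 0)
        {
            sb.Append('"').Append(value.Replace("\"", "\"\"")).Append('"');
        }
        else
        {
            sb.Append(value);
        }
    }
}
```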

2016/02/07

Why not use one of the open source CSV libraries out there?

I know it sounds like overkill for something that appears so simple, but as you can tell by the comments and code snippets, there's more than meets the eye. In addition to handling full CSV compliance, you'll eventually want to handle both reading and writing CSVs... and you may want file manipulation.

I've used Open CSV on one of my projects before (but there are plenty of others to choose from). It certainly made my life easier. ;)

2008/08/20

You could instead add the comma as the first thing inside your foreach.

if (sb.Length > 0) sb.Append(",");
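
In context, the check sits at the top of the loop body. Reading sb.Length is a simple property access, so the per-iteration cost the question worries about should be negligible. One caveat with the length test: if the first value is an empty string, the builder is still empty on the next pass and the separator gets skipped. A boolean flag avoids that. The sketch below uses the flag and takes a plain sequence of names (an assumption, to keep it self-contained); SeparatorFirst and JoinWithCommas are made-up names.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SeparatorFirst
{
    public static string JoinWithCommas(IEnumerable<string> names)
    {
        StringBuilder sb = new StringBuilder();
        bool first = true;

        foreach (string name in names)
        {
            // Write the separator before every item except the first,
            // so there is never a trailing comma to remove.
            if (!first) sb.Append(',');
            first = false;

            sb.Append(name);
        }

        return sb.ToString();
    }
}
```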

2011/11/26

You could also build a list of the c.Name values and use the String.Join method to create your line.

public string ReturnAsCSV(ContactList contactList)
{
    List<String> tmpList = new List<string>();

    foreach (Contact c in contactList)
    {
        tmpList.Add(c.Name);
    }

    return String.Join(",", tmpList.ToArray());
}

This might not be as performant as the StringBuilder approach, but it definitely looks cleaner.

Also, you might want to consider using CultureInfo.CurrentCulture.TextInfo.ListSeparator instead of a hard-coded comma. If your output is going to be imported into other applications, a hard-coded comma might cause problems: ListSeparator can differ across cultures, and MS Excel, at the very least, honors this setting. So:

return String.Join(
    System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator,
    tmpList.ToArray());
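
If you want to control or test which separator is used, one option is to pass the culture in explicitly instead of reading CurrentCulture inside the method. A small sketch, assuming .NET 4 or later for the IEnumerable<string> overload of string.Join; CultureCsv and JoinForCulture are made-up names.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class CultureCsv
{
    // Joins the values with the list separator of the given culture.
    // No quoting is applied, so values containing the separator
    // would still need escaping.
    public static string JoinForCulture(IEnumerable<string> values, CultureInfo culture)
    {
        return string.Join(culture.TextInfo.ListSeparator, values);
    }
}
```

Calling CultureCsv.JoinForCulture(names, CultureInfo.CurrentCulture) matches the snippet above, while passing CultureInfo.InvariantCulture gives a plain comma regardless of the machine's settings.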
2016/02/07

Source: https://stackoverflow.com/questions/4432
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow