List News Background articles Commentaries Development/Java IT Security Computer Guides & Tips

Understanding CSV Files: What they are and how to use (How-to)

Added at 02/04/2023 by Frank Hissen

Using CSV files is a daily routine for many people who have to accomplish a - fully automatic or semiautomatic - data exchange between different IT systems. CSV files are not the only solution path, but are very common. This concerns for instance online and e-commerce businesses which import the current orders from eBay into their own ERP system (Enterprise Resource Planning) or the accounting when exporting tax data to the tax office.

CSV originally stands for "comma-separated values" which is a very general phrasing but fits the universal usage in practice of CSV files. Although, today the separation using commas does not happen that often anymore. However, the German tax office portal ("Elster"), for instance, uses comma-separation for transmitting VATs through CSV files.
Everyone who knows what an Excel or database table looks like also knows what a CSV file basically contains. That is exactly one table. This usually includes a header line containing column or field names (the first line). However, this is not a must in case it is clearly defined which column contains which data and which fixed order the columns have.

'Defined' in this case means that the data exchanging systems have agreed to a certain format. The basic properties of a CSV file are:

Character Encoding and Column Separator

Autodetecting encodings can be error-prone in practice, which is why when importing CSV files the encoding usually is implicitly stated or explicitly specified. The same is true for the column separator.
German Umlauts are a good example. In general, through the right encoding, e.g., UTF-8, every alphabet used in the world can be transmitted safely. However, if the wrong encoding is used when exporting and importing the CSV file, character errors occur. Like in the case of German "ä" instead of "ä" or "ß" instead of "ß". Also non-visible or non-representable control characters might occur because the wrong encoding is specified.

Since many information contains, e.g., commas, using tab stops (TAB) as separator char for CSV files mitigates processing errors in practice. TABs are usually not part of fields like personal names, street names or product information. As alternative char sequences can also be used as separator.

Import/Export: Scripting & Automatization

Although this article describes the data exchange using CSV files, it should be pointed out that fully automated data processing is usually realized using integrated interfaces like APIs, JSON or XML web services. Using these means programming integrated processes which follow common, well-defined standards. The actual data is then exchanged directly between systems (machine-to-machine), the exchange itself is not visible to the end-user.

Using files can also mean using JSON or XML files. However, these are usually generated using scripts or other custom-made components and not programs like Excel or Libre/OpenOffice. JSON and XML are also able to represent tables, but have the ability to represent hierarchic data structures as well which is a clear advantage in some cases.

The advantage of CSV files - if the format has been understood and the exchange defined - is the simplicity, regarding the handling and the structure. Desktop applications which are closely connected to the CSV format and allow for reading, editing and exporting are Microsoft Excel and LibreOffice Calc resp. OpenOffice Calc.
When data exchanges between systems become established, CSV files are usually created and read using applications (e.g., eBay, Paypal, Afterbuy etc.), database scripts or other programs. When done efficiently, the data amount and number of records etc. can be very large also when using CSV files. For instance, when the files are read and written per line.

When there is no direct export/import functionality included in the concerning systems, using access to the underlying database can generally be used to create such a function externally.

Using CSV Files: Examples

Although Excel is a very powerful spreadsheet tool, importing CSV files with different formats cleanly can be a challenge. Libre/OpenOffice Calc on the other hand includes a very simple import dialog to correctly read CSV files, edit them and also save them again into the correct format. I want to point out that Libre/OpenOffice can be smoothly installed alongside MS Office and Excel and Excel can still be used as standard application for opening Excel- and CSV files. This way, one can easily and safely try out processing CSV in Libre/OpenOffice and also try out Libre/OpenOffice in general.

In the following, I show a standard CSV file from Paypal. It is a Paypal report which contains the current orders and payments to be used for further processing in, e.g., the local ERP or accounting software. This report can be created in the report section of the Paypal site. In the example, I use a Windows 11 computer with LibreOffice installed. (The CSV has German column headings.)

Here one can see the CSV file in 'Windows Explorer'. The system used in this case has no Excel installed which is, however, unimportant for the procedure shown.

Blog - Understand CSV Files

Using the right mouse click on the file opens the context menu. The menu item "Open with..." enables using other programs to open the CSV file than the default / standard application:

Blog - Understand CSV Files

As first step, we look at the file using the Windows editor 'Notepad':

Blog - Understand CSV Files

At first glance, the visualization looks more or less structured. However, regarding the blurred personal data, one can quickly see that the data arrangement using tabs in a simple editor does not provide good readability of the file. The advantage is, one can easily make changes to the data - given no tabs are deleted:

Blog - Understand CSV Files

The same file in SciTE, which is an open-source editor, displaying all control / functional characters. Here one can clearly see that there are no simple spaces but tabs between columns:

Blog - Understand CSV Files

At the end of each line / row you can see the combination of "carriage return" and "line feed":

Blog - Understand CSV Files

Now, we open the file in LibreOffice. LibreOffice Calc detects that we read no Excel and no OpenDocument file and starts the import filter dialog. Here one can see very nicely what to set-up (especially encoding and separator) to cleanly import the CSV. Also, you can see a preview of how the file will be parsed or read using these settings:

Blog - Understand CSV Files

Once loaded and opened, the CSV file content cannot be distinguished from an Excel file. LibreOffice Calc allows now to make changes to the file and saves these changes automatically in the chosen CSV format that was used for the import. Using the "Save as..." menu item, every Excel file opened can be transformed or exported to a CSV file as well!

Blog - Understand CSV Files

Blog - Understand CSV Files

Blog - Understand CSV Files

CSV Sample File and Application Example

A sample file in Excel with the corresponding CSV like it is used in our software "ParcelConnector" you can find here for download: Desktop-ParcelConnector. For the import of CSV files we use the format "UTF-8 and TABs".

About HissenIT

HissenIT offers a portfolio of ready-to-use products and tried and tested solutions as well as custom software development. For instance, we develop Web Services or connect to them and we build interfaces and automation technologies for, e.g., Data Import/Export.
Our in-house products like InvoiceCreator and ParcelConnector offer the ability to generate invoices, delivery notes, and documents in general as well as purchasing DHL shipping labels. These products can also be used as a basis for tailor-made software.

Keywords

CSV, Files, Data Interface, Data Import, Data Export, Data Exchange, Automation, Script, Generating, Tables, Databases, APIs, Web Services, eBay, Amazon, E-commerce, ERP, Shop, Shop Systems

Categories: Background articles Development/Java Computer Guides & Tips


Comments

Post your comment

Share

If you like this page, it would be a great thing if you share it with others:

Mail Facebook Twitter Pinterest LinkedIn
reddit Digg StumbleUpon XING
WhatsApp Telegram