Welcome to bytebang » The blog about all and nothing » Efficient CSV analysis with H2

Efficient CSV analysis with H2

Aug 04 2014

Installation of H2

If you try to analyze a huge amount of CSV data then Excel becomes pretty slow in some scenarios. This article shows how to import a csv file into a H2 in-memory database in order to perform analysis with database-power!

The installation is straight forward. Got to www.h2database.com and download the right installer for your operating system. I personally prefer the "All Platforms" zipfile because i often switch the operatingsystem. Once you have downloaded the files you have to start the database and point a webbrowser to the "Management Console" of the h2 database by entering the url http://localhost:8082

Depending on what you want to archive you can play around with the JDBC connection string or (in must cases this will be sufficient) you just press "Connect".

Import the CSV File

Lets assume the following CSV File (stored on c:\test.csv) that was exported from Excel:

source_excel.jpg

Importing a CSV File is straight forward. Just execute the following SQL statement in the previously opened database connection:

CREATE TABLE MYNEWTABLE AS SELECT * FROM CSVREAD('c:\test.csv',NULL,'charset=UTF-8 fieldSeparator=;');

This creates a new table named MYNEWTABLE with 3 columns:

result_in_H2.jpg

The (optional) parmeters 'charset=UTF-8 fieldSeparator=;' are used to align the h2 csv importer function to the output format that is created by Excel.

You are of course not limited to the UTF-8 encoding. H2 supports all encodings that are supported by your Java virtual machine. Here are the supported encodings for the oracle vm.
Another neat feature is that H2 it is able to read and write different csv flavors by specifying the correct csv options.

Further steps

Now you have all the data from the CSV in the database. A good idea is to create indizes on the fields that you are gonna ues as search criteria - this speeds up the database a lot.
Once you are finished with your activity you can also export the data back into a csv file. Have a look into the H2 Documentation for CSV Import / Export for further details on this and other topics.

Get Social


(c) 2024, by bytebang e.U. - Impressum - Datenschutz / Nutzungsbedingungen
-