SQLite Tutorial for iOS: Creating and Scripting

A SQLite tutorial series for iOS. This part focuses on creating a SQLite database with a command line tool. By Ray Wenderlich.

Leave a rating/review
Save for later
Share
You are currently viewing page 2 of 3 of this article. Click here to view the first page.

Creating Our Database : Via Python!

Python also comes with a built-in sqlite3 library that makes it super easy to interact with sqlite3 databases.

Python should be pre-installed on your Mac. Create a file named parseBanklist.py in the same directory that your CSV file is in, and start by adding the following code:

import sqlite3;
from datetime import datetime, date;

conn = sqlite3.connect('banklist.sqlite3')
c = conn.cursor()
c.execute('drop table if exists failed_banks')
c.execute('create table failed_banks(id integer primary key autoincrement, 
    name text, city text, state text, zip integer, close_date text, updated_date text)')

The first line imports the sqlite3 module, and the second imports some datetime functionality we’ll need later.

Then we open up a connection to our SQLite DB by simply passing in the name of the file. We then create a cursor, which we can use to execute SQL statements or move through rows of results.

Then we call execute on the cursor to run two different SQL statements: one to drop any existing table by the name of failed_banks, and the second to create our table.

It’s good practice to have a primary key for database tables – so you can quickly access data by the primary key, for example. The “name” field wouldn’t be a good choice for a primary key, because banks could have the same name (such as if they were from different states).

So instead we create our own id field that is a unique number for each row. We mark it as auto-increment so the SQLite engine will handle assigning each row an incrementing number as we add data.

Also note that there is no date data type for SQLite. However, according to the SQLite datatype reference, as long as you put dates in a particular format (one of which is a text string in the format “YYYY-MM-DD HH:MM:SS.SSS”), SQLite’s date and time functions will be able to handle the values as dates – so that is what we’ll do!

At this point, you can run your Python script with “python parseBanklist.py” and then use the sqlite3 utility to verify that the database table has been created. Next, to add some data!

Inserting Data via Python!

Python has a neat function called string.split() where you can specify a delimiter, and Python will break the string up into an array of substrings. You can see this at work with the following:

>>> a = "Unity National Bank,Cartersville,GA,34678,26-Mar-10,26-Mar-10"
>>> a.split(",")
['Unity National Bank', 'Cartersville', 'GA', '34678', '26-Mar-10', '26-Mar-10']

However, there is no built-in way (that I know of) to have Python handle quotes as escapes for separation, in order to handle the “La Jolla Bank, FSB” example shown above. So let’s write our own little function to split up a line that handles quote escaping:

def mysplit (string):
    quote = False
    retval = []
    current = ""
    for char in string:
        if char == '"':
            quote = not quote
        elif char == ',' and not quote:
            retval.append(current)
            current = ""
        else:
            current += char
    retval.append(current)
    return retval

This is quite straightforward Python here – we simply go through the string character by character, building up the list of sub-items separated by commas along the way.

Now that we have this, we can go line-by-line through our CSV file, inserting data into our database:

# Read lines from file, skipping first line
data = open("banklist.csv", "r").readlines()[1:]
for entry in data:
    # Parse values
    vals = mysplit(entry.strip())
    # Convert dates to sqlite3 standard format
    vals[4] = datetime.strptime(vals[4], "%d-%b-%y")
    vals[5] = datetime.strptime(vals[5], "%d-%b-%y")
    # Insert the row!
    print "Inserting %s..." % (vals[0])
    sql = "insert into failed_banks values(NULL, ?, ?, ?, ?, ?, ?)"
    c.execute(sql, vals)

# Done!
conn.commit()

In the first line, we just open up the CSV in read mode, read all of the lines, and skip the first line (because it contains a header we don’t want to read).

We then iterate through each line, and use our helper function to split the line into pieces.

We use the Python datetime module to convert the CSV data (in a format such as “26-Mar-10”) into a Python datetime object that we could then easily format however we want. But we’ll just take the default, which happens to output the dates in the format SQLite expects.

Finally, we create our SQL statemen, inserting question marks wherever user-specifie parameters should be, and then execute the statement passing in the users-specified values. We pass NULL for the first parameter (the id field) so that SQLite can generate the auto-incremented ID for us.

And that’s it! You should be able to run sqlite3 to verify the data is correctly in the database:

sqlite> select * from failed_banks limit 3;
1|Desert Hills Bank|Phoenix|AZ|57060|2010-03-26...
2|Unity National Bank|Cartersville|GA|34678|2010-03-26...
3|Key West Bank|Key West|FL|34684|2010-03-26...

Gratuitous Bar Graph

Once I’d gotten this far, I couldn’t stop myself from extending the script to make a little bar graph showing the number of failed banks by the year.

If you’re just interested in learning sqlite3, feel free to skip this step. But if you like making cool bar charts for no apparant reason, here’s how :]

  • Install the latest 2.6 version of Python from python.org. Yes, the Mac already comes installed with Python, but this makes it so that the third party libraries we’re about to install install smoothly/easily. At the time of writing, I used the Python 2.65 Mac Installer.
  • Install the latest version of Numerical Python. I used the 1.3.0 py26 Mac installer.
  • Install the latest version of Matplotlib. I used the 0.99.1 py26 Mac installer.

Once you’ve got that installed, you can add the following to the bottom of your Python script:

# Get failed banks by year, for fun
c.execute("select strftime('%Y', close_date), count(*) from failed_banks group b
y 1;")
years = []
failed_banks = []
for row in c:
    years.append(row[0])
    failed_banks.append(row[1])

# Plot the data, for fun
import matplotlib.pyplot as plt
import numpy.numarray as na

values = tuple(failed_banks)
ind = na.array(range(len(values))) + 0.5
width = 0.35
plt.bar(ind, values, width, color='r')
plt.ylabel('Number of failed banks')
plt.title('Failed banks by year')
plt.xticks(ind+width/2, tuple(years))
plt.show()

If all goes well, you should see a cool chart like this:

Failed Banks Since 2000 Graph

I wonder what will this look like by the end of 2010?