Reblog: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets

No full blog post this time. I just wanted to point you towards an excellent article I read yesterday, by Joel Spolsky: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

The article provides a short and highly readable intro to the different ways text can be encoded, and why the concept of “plain text” is not only a myth, but a dangerous myth. The article itself dates from 2003, but the information is definitely still relevant, and I am ashamed to say that I have been ignorant of the implications of text encodings for a long time. I don’t know if Joel’s threat to make me “peel onions for 6 months in a submarine” is still valid, but if it is, I best start packing.

It’s an absolute must read if you ever write any code at all. It won’t give you all the answers, but will hopefully leave you knowing what you don’t know, and that, I believe, is half the battle.

Now go click that link!

Posted in Software Development | Tagged , , | Leave a comment

Simple Python script that runs SQL scripts against a given SQLite database

Many of the search hits for my blog are along the lines of “python sqlite script”. I really hope those folks find what they are looking for.

I thought I might add this really simple script to run SQL script files against a SQLite database. I use this script all the time. Especially when I only want to make a tiny change to my database to test something, I find find it faster and more convenient to quickly type the SQL into a text file and run a python script from the command line, than it would have been through some SQLite interaction GUI. It becomes especially convenient if I use it alongside LINQPad – I can make changes using the command line, and run queries via LINQPad.

The script is really simple and requires no explanation, just some warnings. It targets Python 2.7 and won’t work in Python 3 as is. The exception handling does nothing except tell you “something” went wrong.

It doesn’t do queries – it only runs scripts against the database but reads no values back. But you can use it for create, insert, update and delete.

The script should be ready to use as is, and the command line is as follows:

python.exe .\RunSQLiteScript.py script.sql database.sqlite

If the database file does not exist, it will be created, so you can use this script to create a new SQLite database.

Below is the code, hosted in a gist – if you improve on this by adding better exception handling or whatever, please submit the changes, so we can all benefit Smile

Hope this helps!

Posted in Software Development | Tagged , , , , | 2 Comments

Hootsuite Hootlet causing problems with WordPress code snippets

Last night I reviewed one of my blog posts, and noticed that the source code samples were all scrambled in some places. I eventually figured out that it was the #region and #endregion tags that got messed up. I was surprised that I had not noticed it before, and changed the region tags to comments as a temporary solution.

This morning, on closer inspection, I noticed something suspicious in the scrambled snippets…

The original

Below is an actual code snippet – whether it is scrambled or not will depend on your browser.

public class MyClass
{
	#region Constructor

	public MyClass() { }

	#endregion
}

Here is a screenshot of the code for that snippet:

CropperCapture[4]

The problem

When I view that snippet as a draft post in Google Chrome, I get the following result:

CropperCapture[2]

Those class=”_hootified” statements were what made me suspicious. I had recently installed Hootsuite’s Hootlet browser extension for Google Chrome. I know Chrome uses no extension in incognito mode, so I popped open an incognito window and loaded the exact some page:

CropperCapture[3]

Perfect!

So apparently Hootlet is trying to do something clever with anything marked with a #, probably assuming it’s a Twitter hashtag (like there existed no hashes online pre-Twitter!).

I’m a bit relieved that the error wasn’t in my actual blog source, but I am also concerned that the rendering of those snippets are so unpredictable – how sure can I be that my readers are seeing my code snippets the way I intended?

I’m also not sure who  I should take this up with. Hootsuite? Chrome? WordPress? Rich Hewlett who wrote the WordPress source code plugin for Windows Live Writer? (Thanks Rich, it’s awesome!)

I’ll post an update once I got this figured out!

Update:

I posted about this on the HootSuite forum, and I’m impressed – I got a response in less than 12 hours. They suggest turning off the “Link Twitter status and user profile handles into the HootSuite dashboard” feature by right clicking on the extension icon and selecting Options:

While I appreciate the workaround, I hope they fix this on their end – another reader of my might have the same extension installed and not know about this effect, assuming instead that my code is quite bizarre!

Posted in General Technical | Tagged , , , , , , | 2 Comments

Organising custom Commands in a WPF TreeView using the Model-View-ViewModel design pattern

In my current project’s interface, the main elements of the GUI are a TabControl holding any number of workspaces, and a side panel with commands to launch the desired workspaces. It is closely based on a MVVM demo I saw, where the commands are displayed as a list of hyperlinks. However, since I have many possible workspaces, and they are logically grouped together, I wanted the side panel to list the available workspaces in a collapsible TreeView, and have the user open the desired workspace by clicked the corresponding command link.

In this post I will show you how I put together a tree of commands that can be modified at runtime, using the Model-View-ViewModel (MVVM) pattern. The result looks something like this:

CropperCapture[2]

The Model-View-ViewModel Design Pattern

MVVM is an interface design pattern is ideally suited for use with WPF, due to the way WPF GUI code binds to data and refreshes the display. I am not going to explain or demonstrate the MVVM pattern here – if you are unfamiliar with it, check out Josh Smith’s excellent introduction over at MSDN. My code here is based closely on the Demo Application code available in that article.

Putting Commands in ViewModels

RelayCommand

In his article’s demo code, Josh uses a RelayCommand class, which implements ICommand, to wrap actions that can be invoked by the View. The ViewModel then hosts a RelayCommand, and in the View, a button or hyperlink binds to the RelayCommand. Some action is performed by the ViewModel when the command is invoked. You can find the source code for the RelayCommand class in Figure 3 of Josh’s article. Here I will just show how it is used, as this will make the rest of this post make more sense. Below is the ViewModel’s initialisation of a RelayCommand, and beneath that the binding to that command by the View:

// In the ViewModel:

private RelayCommand _demoCommand;

// Command Property:
public RelayCommand DemoCommand
{
    get
    {
        if (_demoCommand == null)
        {
            // Initialize the command using a lambda expression:
            _demoCommand = new RelayCommand(param => DemoCommandTarget());
        }

        return _demoCommand;
    }
}

private void DemoCommandTarget()
{
    // Do stuff that you want the command to do...
}
        <!-- In the View: -->

        <Button Command="{Binding Path=DemoCommand}">Click Me!</Button>

CommandViewModel

Josh then goes a step further by declaring a ViewModel for commands, allowing him to define a default DataTemplate for displaying command links to the user. The CommandViewModel wraps the RelayCommand, which is injected upon construction. It further exposes a DisplayName property (which is implemented in the ViewModelBase class). Binding in the View is then to the CommandViewModel, rather than directly to the RelayCommand. Below is a slightly simplified version of the CommandViewModel from Josh’s article, which we are going to be modifying soon:

// Represents an actionable item displayed by a View.
public class CommandViewModel : ViewModelBase
{
    public CommandViewModel(string displayName, ICommand command)
    {
        base.DisplayName = displayName;
        this.Command = command;
    }

    public ICommand Command { get; private set; }
}

Grouping Commands into a Tree

In order to group CommandViewModels into a tree structure, we declare two more classes. We declare CommandTreeItem as a new abstract (and empty) baseclass for CommandViewModel, as well as a new CommandTreeGroup class. We let the CommandTreeGroup class expose an ObservableCollection of CommandTreeItem baseclass instances, thus allowing it to hold both CommandViewModel instances and more CommandTreeGroup instances. The code is below, but if your head is spinning from all these class names (mine is), check the image below the code for a class diagram that hopefully makes everything clear:

public class CommandTreeGroup : CommandTreeItem
{
    //region Fields

    private ObservableCollection<CommandTreeItem> _commandTreeItems;

    //endregion

    //region Constructor

    public CommandTreeGroup(string displayName)
    {
        this.DisplayName = displayName;
    }

    //endregion

    //region Properties

    public ObservableCollection<CommandTreeItem> CommandTreeItems
    {
        get
        {
            if (_commandTreeItems == null)
                _commandTreeItems = new ObservableCollection<CommandTreeItem>();

            return _commandTreeItems;
        }
    }

    //endregion
}

The updated class structure now looks like this:

Class_Diagram

Initializing the Command Tree

We can now expose an ObservableCollection of CommandTreeItems in the MainWindowViewModel (lines 19-32), and fill it up with commands, groups of commands and subgroups of commands (lines 39-63). The code below creates the commands shown in the example tree at the very top of this post:

/// <summary>
/// ViewModel for application's main window.
/// </summary>
public class MainWindowViewModel : ViewModelBase
{

    //region Constructor

    public MainWindowViewModel()
    {
        base.DisplayName = "MVVM Command Tree Demo";
        CreateInitialCommands();           
    }

    //endregion

    //region Properties

    private ObservableCollection<CommandTreeItem> _commands;

    public ObservableCollection<CommandTreeItem> Commands
    {
        get
        {
            if (_commands == null)
                _commands = new ObservableCollection<CommandTreeItem>();

            Console.WriteLine("Commands queried, returning {0} items", _commands.Count);

            return _commands;
        }
    }

    //endregion

    //region Command Creation

    private void CreateInitialCommands()
    {
        // Create a groupless command:
        CommandViewModel groupLessCommand = new CommandViewModel("Groupless Command", new RelayCommand(param => Console.WriteLine("Clicked Groupless Command")));
        this.Commands.Add(groupLessCommand);

        // Create a Group:
        CommandTreeGroup group1 = new CommandTreeGroup("Command Group");
        this.Commands.Add(group1);

        // Add some Commands to the Group:
        CommandViewModel commandA = new CommandViewModel("Command A", new RelayCommand(param => Console.WriteLine("Clicked Command A.")));
        group1.CommandTreeItems.Add(commandA);

        CommandViewModel commandB = new CommandViewModel("Command B", new RelayCommand(param => Console.WriteLine("Clicked Command B.")));
        group1.CommandTreeItems.Add(commandB);

        // Create an inner group inside the first group:
        CommandTreeGroup innerGroup = new CommandTreeGroup("Inner Group");
        group1.CommandTreeItems.Add(innerGroup);

        // Add a command to the inner group:
        CommandViewModel innerCommandB = new CommandViewModel("Add More Commands", new RelayCommand(param => this.AddCommand()));
        innerGroup.CommandTreeItems.Add(innerCommandB);            
    }

    private void AddCommand()
    {
        // Add a second groupless command:
        CommandViewModel newGrouplessCommand = new CommandViewModel("New Groupless Command", new RelayCommand(param => Console.WriteLine("Clicked New Groupless Command")));
        this.Commands.Add(newGrouplessCommand);
    }

    //endregion

}

In the code above, most commands are just lambdas that print out the command name when invoked, with the exception being innerCommandB, which calls AddCommand() to add another command to the tree at run time. Because the root list and the sub lists are all implemented as ObservableCollections, the View will automatically update to show the additional command.

Displaying the Command Tree

The rest of the niftiness happens over in the View, which really isn’t all that complicated. Here is the complete view code, with comments below:

<Window x:Class="MVVM_CommandTree_Demo.View.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:vm="clr-namespace:MVVM_CommandTree_Demo.ViewModel"
        Title="{Binding DisplayName}" Height="600" Width="800">
    <DockPanel>
        <TreeView ItemsSource="{Binding Path=Commands}">
            <TreeView.Resources>

                <DataTemplate DataType="{x:Type vm:CommandViewModel}">
                    <TreeViewItem>
                        <TreeViewItem.Header>
                            <Hyperlink Command="{Binding Path=Command}">
                                <TextBlock Text="{Binding Path=DisplayName}" />
                            </Hyperlink>
                        </TreeViewItem.Header>
                    </TreeViewItem>
                </DataTemplate>

                <DataTemplate DataType="{x:Type vm:CommandTreeGroup}">
                    <TreeViewItem Header="{Binding Path=DisplayName}"
                                  ItemsSource="{Binding CommandTreeItems}" />
                </DataTemplate>

            </TreeView.Resources>
        </TreeView>                       
    </DockPanel>
</Window>

We start off in line 7 with a TreeView, for which we bind ItemsSource to the ObservableCollection of CommandTreeItem instances in the MainWindowViewModel, to form the root of our command tree.

We then declare two DataTemplates to tell the TreeView how to display a CommandViewModel and a CommandTreeGroup respectively.

DataTemplate for Commands

The first DataTemplate, for CommandViewModel, is intended to display the tree’s leaf nodes – the actual commands. It renders a TreeViewItem with a Hyperlink as header. The Hyperlink Command binds to CommandViewModel.Command, while the Hyperlink Text binds to CommandViewModel.DisplayName:

<DataTemplate DataType="{x:Type vm:CommandViewModel}">
    <TreeViewItem>
        <TreeViewItem.Header>
            <Hyperlink Command="{Binding Path=Command}">
                <TextBlock Text="{Binding Path=DisplayName}" />
            </Hyperlink>
        </TreeViewItem.Header>
    </TreeViewItem>
</DataTemplate>

DataTemplate for Command Groups

The second DataTemplate, for CommandTreeGroup, is intended to display sub groups of commands. It renders a TreeViewItem with a plain text Header, bound to CommandTreeGroup.DisplayName. In addition, the TreeViewItem’s ItemsSource is bound to the ObservableCollection of CommandTreeItems in the group, thus creating the next set of branches and leaves:

<DataTemplate DataType="{x:Type vm:CommandTreeGroup}">
    <TreeViewItem Header="{Binding Path=DisplayName}"
                  ItemsSource="{Binding CommandTreeItems}" />
</DataTemplate>

That’s it!

Super simple stuff! I hope that was useful. You can download the complete demo code here. Please leave a comment below, and let me know if something didn’t make sense or if you have ideas of awesomeness to add!

Update: I get quite a number of hits on this page, as well as a number of downloads for the demo code, and I’m always left wondering “Did you find what you were looking for? Did this help you?” If it did (or did not), please leave a comment, I would love to know!

Posted in Software Development | Tagged , , , , , , , , , , , | Leave a comment

Quick and easy way to insert a PDF graph or diagram into a Microsoft Word Document without losing too much quality

I had to solve this one for my boss today, and thought I would make a quick post with the solution I found.

The Image Quality Problem

In our line of work, when we want to include a PDF in a Word document, 9 out of 10 times it is a diagram or a graph. Word does not take kindly to this (I’m using Word 2010, Your Mileage May Vary). The image quality of an imported PDF document is greatly reduced when inside a word document.

The solution I found is to print the PDF diagram to a PNG image, using the Bullzip PDF Printer, and then to insert the PNG image in the Word document. I’ll use a series of screen shots to illustrate the problem and the solution.

Below is a screenshot of the original PDF diagram. Isn’t it lovely?

Original

Inserting the PDF directly

There are two ways of directly inserting this PDF in a Word Document. One is to simply drag and drop the PDF file onto Word, the other is to use Insert – Object. Whichever way you choose, the result has really low quality:

FromPDF

(In case you’re wondering, I took the above screenshot after printing the Word document to PDF, so that is the actual output quality.)

Converting the PDF to PNG before inserting it

If we first print the PDF to PNG and then import it to Word, the result looks like this:

FromPNG

Much better!

How to do it

First of all, download and install Bullzip PDF Printer from here. It’s excellent, and it’s free! After installation, it will show up as a printer on your machine.

Next, open the original PDF diagram in Acrobat Reader, and go to File – Print. On the Print dialog, set the Printer to Bullzip PDF Printer. Also set the Size Options as shown below, to avoid ending up with lots of white space around your diagram:

PrintSettings

After hitting Print, the Bullzip PDF Printer – Create File dialog will pop up. Here the important thing to change is the Format, which we set to PNG:

BullzipSettings

You can also set the output File Name. After hitting Save, you are left with a *.PNG file at the location of your choosing. You can now simply drag and drop this image onto Word to have much higher image quality than the PDF would have given you. There you go!

The way it should be

All of this back of forth between Python (which I used to generate the plot), Acrobat Reader, Bullzip and Word makes me miss them good-ol’-days, when I was working on my thesis.

Back then, working on a Linux machine,  I had Python generate plots and export them to *.eps (Encapsulated PostScript) files, which could be linked directly into my LaTeX document for perfect output to a *.ps or *.pdf document. The greatest advantage of that approach is that updates to the plot can be pushed all the way to your final document by running a single script, with no manual PNG printing, changing of settings, or dragging and dropping. That really is the way it should be, especially when working with plots and diagrams that depend on data, which might change at the last minute.

Posted in General Technical | Tagged , , , , , , , | 6 Comments

A Python script that finds and runs other Python scripts (which I use to build my SQLite database)

My, what an unwieldy blog title.

While working on my current project, I wanted to be able to generate and fill my SQLite database from scratch, in an instant, by calling a single script. I found a convenient way of doing this is to have a folder of Python scripts – each script does something to the database, typically creating a table and filling it with initial data which it reads from a text file.

As development continues, I’m constantly adding more tables or more data, so  I wanted new scripts in that folder to automatically be include in the database building process. To do this, I wrote another script to find and run everything it finds in that folder. This has been very convenient, and has saved me a lot of effort over the past couple of months. It works like this:

  1. I call RunDatabaseScripts.py, giving it a reference to the database file and a reference to the folder containing the scripts I want to run.
  2. RunDatabaseScripts.py scans the script folder for files that match Script*.py, and loads them all.
  3. RunDatabaseScripts.py opens a connection to the database. It will create the file if it does not already exist.
  4. RunDatabaseScripts.py iterates over the loaded scripts, passing each of them the open database connection.
  5. RunDatabaseScripts.py closes the database and prints a little run report.

Here is the directory structure for a contrived example with only two database scripts:

root
 |
 |-- RunDatabaseScripts.py
 |
 |-- scripts
       |
       |-- Script_One.py
       |
       |-- Script_Two.py
       |
       |-- __init__.py

The Database Scripts

Script_Two.py

Before we get to RunDatabaseScripts.py, lets have a look at the subject scripts that are run by it. Here is the code for Script_Two.py, a simple script that creates a new table in the database:

Name = "Script Two"

def CanRun(connection, listOfExecutedScriptNames):
    ''' Return False if this script can not yet run, True if it can.'''
    return True

def Run(connection):
    print "\n\tStarting %s Script" % Name

    dbCursor = connection.cursor()

    # Create table:
    SQLScript = r"""
    DROP TABLE IF EXISTS DemoTable;
    CREATE TABLE DemoTable (    [ID] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
                                [DemoString] TEXT NOT NULL ) """

    dbCursor.executescript(SQLScript)
    connection.commit()

    print "\tCreated DemoTable"

    print "\tEnd of %s Script" % (Name)

There is a bit an interface that RunDatabaseScripts.py expects the database scripts to adhere to:

  • The Name property in line 1 is needed by RunDatabaseScripts.py for reporting failures and for ensuring the scripts execute in the right order.
  • The CanRun function in line 3 is called by RunDatabaseScripts.py before the script is run – the script can implement custom code here to make sure any prerequisites have been satisfied. It receives an open connection to the database, and a string list of Names of those scripts which have already been run. Script_One.py below demonstrates the use of this.
  • The Run function in line 7 is called by RunDatabaseScripts.py, and is where the meat of the script goes. It receives an open database connection.

Script_One.py

Below is the code for Script_One.py, which adds some data rows to the table created by Script_Two.py:

Name = "Script One"

def CanRun(connection, listOfExecutedScriptNames):
    ''' Return False if this script can not yet run, True if it can.'''
    if "Script Two" not in listOfExecutedScriptNames:
        return False
    else:
        return True

def Run(connection):
    print "\n\tStarting %s Script" % Name

    dbCursor = connection.cursor()

    changesPre = connection.total_changes # Count Changed Rows

    dataToAdd = ["These", "Are", "Some", "Strings", "To", "Add", "To", "The", "Database"]

    for word in dataToAdd:
        # Write to DB:
        cmd = "INSERT INTO DemoTable VALUES (null, ?)"
        values = (word,)
        dbCursor.execute(cmd, values)

    connection.commit()

    changesPost = connection.total_changes # Count Changed Rows
    changes = changesPost - changesPre # Count Changed Rows

    print "\tAdded some rows to DB"

    print "\tEnd of %s Script" % (Name)

Of interest here is the fact that CanRun now checks that Script_Two.py has already been run – it needs the table to be created before it can add rows. CanRun uses the Name assigned to Script_Two.py and the listOfExecutedScriptNames received from RunDatabaseScripts.py to check this. If Script_Two.py has not been run, CanRun returns False – RunDatabaseScripts.py will then move on to the other scripts before returning to this one.

__init__.py

This empty file is needed for Python to recognise the scripts  folder as an importable module.

The launcher script: RunDatabaseScripts.py

Below are the two major functions of the launcher script. You can view and download the full script code here.

Loading the script files as modules

The LoadScripts function scans the script folder for Script*.py files, loads them, and returns a list of loaded modules:

def LoadScripts(scriptFolder):
    ''' Returns a list of module references '''
    # Find Script Files:
    scriptFiles = glob.glob(os.path.join(scriptFolder, "Script*.py"))
    print "Found %d Script Files" % (len(scriptFiles))

    # Load scripts:
    loadedScripts = []
    for scriptFile in scriptFiles:
        try:

            # Get full module name:
            scriptName = os.path.basename(scriptFile)
            scriptName = os.path.splitext(scriptName)[0]
            folderName = os.path.basename(scriptFolder)
            fullModuleName = "%s.%s" % (folderName, scriptName)

            # Import Script:
            script_module = __import__(fullModuleName, fromlist=["fullModuleName"])

            # Store reference to Script:
            loadedScripts.append(script_module)

        except Exception, e:
            print "Something went wrong while loading script file \'%s\'" % scriptFile
            print e

    return loadedScripts

Line 21 forms the full module name of each script, which consists of the folder name and the filename, e.g. scripts.Script_One. It pops the module name in a variable, and the name of this variable is then put in a single element list given to the actual __import__ call in line 24.

If the subject script file fails to parse due to syntax errors or such, the error will be caught in line 29, and the next script loaded.

Running the loaded scripts

Actually running the loaded script is a simple matter of getting a reference to the module object created in line 24 above, and calling the function you want, in this case the Run function shown in the Script_One.py and Script_Two.py samples:

script.Run(connection)

Most of the code in the RunScripts function deals with making sure each script gets run, that they run in the right order, that there are no circular dependencies between scripts, and that the results are reported. Here is the code, with some comments below:

def RunScripts(databaseFilename, loadedScripts):
    try:
        # Open DB:
        connection = sqlite.connect(databaseFilename)

        # Keep a list of completed scripts so scripts can test if they are ready to run:
        namesOfCompletedScripts = []
        success = True
        # Keep passing over list until everything has been run:
        while (len(namesOfCompletedScripts) < len(loadedScripts)):
            nothingExecutedOnThisPass = True
            # Iterate over scripts:
            for script in loadedScripts:
                try:
                    # Check if this script has been run or not:
                    if (script.Name not in namesOfCompletedScripts):
                        # Ask script if it is ready to run -
                        # - give it a list of completed scripts:
                        if (script.CanRun(connection, namesOfCompletedScripts)):
                            # Run script:
                            script.Run(connection)
                            namesOfCompletedScripts.append(script.Name)
                            nothingExecutedOnThisPass = False
                        else:
                            print "\n%s not ready to run" % script.Name

                except Exception, e:
                    print "Something went wrong while running script %s" % script.Name
                    print e
                    success = False

            if (nothingExecutedOnThisPass):
                # If nothing executed on this pass, we're stuck -
                # - the remaining scripts won't run
                break

        print "\nThe following scripts were successfully run:"
        print namesOfCompletedScripts

        if (len(namesOfCompletedScripts) < len(loadedScripts)):
            print "Some scripts failed:"
            for script in loadedScripts:
                if (script.Name not in namesOfCompletedScripts):
                    print "\t" + script.Name

    except Exception, e:
        print "Something went wrong while applying scripts to DB"
        print e
    finally:
        # Close DB:
        connection.close()

The database connection

The database connection is opened in line 38. This connection is then passed to each script’s CanRun function in line 53 and to each scripts Run function in line 55. Finally, it is closed again in line 85.

Running scripts in the correct order

Some scripts may depend on database elements created by other scripts – in this example, Script_Two.py must create the DemoTable table before Script_One.py can add rows to it.

Each script is allowed to check it’s own prerequisites in its CanRun function, called in line 53. To assist it, RunScripts keeps a list of successfully completed scripts’ names in namesOfCompletedScripts. It also provides CanRun with a database connection, in case the subject script wants to manually check the structure of the database.

If a script indicates that it is not ready to be run, RunScripts skips over it and continues iterating over the rest of the scripts. It then keeps returning for another pass, revisiting all those scripts not yet listed in namesOfCompletedScripts. This includes both those scripts that were previously not ready to run, and those that raised errors during execution.

In the two script example given here, the first pass skips over Script_One.py due to Script_Two.py not having been run. On the second pass, Script_One.py detect that its prerequisites are satisfied, and successfully executes.

RunScripts will continue to pass over the list of scripts until all have been executed. However, a CanRun requirement that cannot be satisfied, or a script that persistently raises an error upon execution, will cause this loop to be infinite. To halt execution in these cases, RunScripts monitors the boolean variable nothingExecutedOnThisPass, which is set to True at the start of each pass, and negated as soon as a script successfully completes. If an entire pass completes without a single script being successfully run, the remaining scripts will never be ready to run. Execution is halted, and the list of failed scripts is printed in lines 72-76.

There you have it

I found this an easy and straightforward way to automate the execution of new scripts. It frees the developer from manually adding scripts to the execution list, and also solves order-of-execution problems automatically. While it was used for database building here, I’m sure to use the same code in future for any scenarios where I have a rapidly changing list of scripts to run.

If you find this helpful, or have any cool ideas, please do leave a comment!

UPDATE 2013-02-28: I figured out that pysqlite2 is deprecated and has been replaced by the sqlite3 library, which is bundled with Python 2.5 and up by default, and thus requires no download. I’ve removed the links to pysqlite2 and modified the downloadable script to use sqlite3 instead.

Posted in Software Development | Tagged , , , , , | 2 Comments

Python Script to Count Tables, Columns and Rows in SQLite Database

One of my colleagues asked me for some metrics regarding the size of our project’s database to put in a report, so I wrote a short little script Python to summarize the dimensions of SQLite tables. It prints a list of the tables in the database, along with the number of columns, rows and cells in each.

The command line is as follows:

python.exe DatabaseSizeSummary.py database.sqlite output.txt

Output is to a specified output file, or to standard out if no file is specified. It’s tab delimited for easy Excel import. Here is a sample of the output text:

TableName	Columns	Rows	Cells
Boreholes	25	820	20500
Canals	28	14	392
Pipelines	25	785	19625
Reservoirs	27	387	10449
Siphons	23	23	529

Number of Tables:	5
Total Number of Columns:	128
Total Number of Rows:	2029
Total Number of Cells:	51495

And here is the same output with a little bit of Excel formatting:

output screenshot

Interesting code includes getting the list of tables:

tableListQuery = "SELECT name FROM sqlite_master WHERE type='table' ORDER BY Name"
cursor.execute(tableListQuery)
tables = map(lambda t: t[0], cursor.fetchall())

and getting the number of columns in each table:

columnsQuery = "PRAGMA table_info(%s)" % table
cursor.execute(columnsQuery)
numberOfColumns = len(cursor.fetchall())

You can download the script here, and if you add any nifty features, do share!

PS: This was the first time I wanted to share a code file on my blog, and I didn’t know what the best way to do so was. I tried Google Drive, but couldn’t get a direct download link to the file. I also considered my Bitbucket account, but ended up using the public folder of my Dropbox account. However, I don’t like the URL much, and I’m sure there is a more generally accepted way of doing this. What do you use to share code files on your blog?

Update: I’m trying out gist.github.com, so below is a version controlled, editable version of the script. Thanks to @simondlr and @konradblum for telling me about this awesome service – you can read more about it’s integration with WordPress here: en.support.wordpress.com/gist/

UPDATE 2013-02-28: I figured out that pysqlite2 is deprecated and has been replaced by the sqlite3 library, which is bundled with Python 2.5 and up by default, and thus requires no download. I’ve removed the links to pysqlite2 and modified the downloadable script to use sqlite3 instead.

 

Posted in Software Development | Tagged , , , , | 1 Comment