You may need to store data at the package level. I am going to show you how to create, read and write to a database inside an installed package. This means that once a user installs your package, it can store data inside the Python package directory itself.

We’re going to build on the work we did here: https://rexbytes.com/2022/11/19/python-packaging-reading-writing-to-datafiles-inside-a-package/

Take a look at the following package structure.

pkgexampledatabases/
├── LICENCE
├── pyproject.toml
├── README.md
├── setup.py
└── src
    └── pkgexampledatabases
        ├── appconfig.yaml
        ├── data
        │   ├── config
        │   │   ├── __init__.py
        │   │   ├── packagedb.json
        │   │   └── userdb.json
        │   ├── __init__.py
        │   ├── sampledata
        │   │   ├── __init__.py
        │   │   ├── shoppinglist.csv
        │   │   └── store.csv
        │   └── storemart.db      <------- HERE IS AN SQLITE DB FILE
        ├── __init__.py
        ├── myappconfigmanager.py
        ├── mycsvhelper.py
        ├── mydatabasemanager.py
        ├── my_database_module.py
        ├── mypathmanager.py
        └── mysqlitemanager.py

Play Along

If you have been following all the previous articles, you should now be very comfortable with Python packaging, console commands, argument parsing and publishing to PyPI.
You are almost a pro.

I’ve already implemented the following examples, and you can look at (or cut and paste) the code from GitHub.

Available On Github

Available To Install From PyPi.org

You can also directly install this example package and play along.

https://pypi.org/project/pkgexampledatabases/

(databases) ubuntu@goodboy:~$ python3 -m pip install pkgexampledatabases
Collecting pkgexampledatabases
  Using cached pkgexampledatabases-0.0.5-py3-none-any.whl (10 kB)
Installing collected packages: pkgexampledatabases
Successfully installed pkgexampledatabases-0.0.5
(databases) ubuntu@goodboy:~$ 

Install right now using the above install command.

Play Along Runthrough

(databases) ubuntu@goodboy:~$ rexdbp --help
usage: rexdbp [-h] [-p] [-u] [-l] [-s] [-f] [-L] [-S] [-X]

A package datafiles example

options:
  -h, --help         show this help message and exit
  -p, --createpdb    Create package db from config file.
  -u, --createudb    Create user db from config file.
  -l, --importlist   Import shopping data to user home dir database
  -s, --importstore  Import store data to package dir database
  -f, --files        Show the full filepaths of all our files
  -L, --outputlist   Output current data in users shopping list database
  -S, --outputstore  Output current data in package store database
  -X, --deletedbs    Delete package and user databases
(databases) ubuntu@goodboy:~$ 

Create Databases

Inside the above package you will find two subdirectories in the data directory.

The config directory contains two files, each describing a database in JSON. One database will be created inside the package and the other, for the fun of it, will be created in the user’s home directory.

The sampledata directory contains sample data which will be inserted into the databases.

shoppinglist.csv will be inserted into the user’s database.
store.csv will be inserted into the package database.

I hope the implementation is clear enough in the GitHub repo. Generating databases from config files is very convenient.

Running the first two commands from the help output above creates one database in the package and one in the user’s home directory.

(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -p
CREATED: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/storemart.db
(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -u
CREATED: /home/ubuntu/.rexdbp/shopping.db
(databases) ubuntu@goodboy:~/myrepos/rex$ 
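
To make the idea of generating a database from a config file concrete, here is a minimal sketch. The JSON layout, the EXAMPLE_CONFIG name and the create_database function are assumptions made for illustration; the real packagedb.json in the repo may be structured differently. The table definition mirrors the productlist schema shown later in this article.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical config layout; the real packagedb.json may look different.
EXAMPLE_CONFIG = """
{
  "database": "storemart.db",
  "tables": {
    "productlist": {"item": "text UNIQUE", "price": "float"}
  }
}
"""


def create_database(config: dict, directory: Path) -> Path:
    """Create the SQLite file and tables described by config, return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    db_path = directory / config["database"]
    connection = sqlite3.connect(db_path)
    try:
        with connection:  # commits on success, rolls back on error
            for table, columns in config["tables"].items():
                column_sql = ", ".join(f"{name} {decl}" for name, decl in columns.items())
                # Table and column names come from our own config file, never from user input.
                connection.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_sql})")
    finally:
        connection.close()
    return db_path


# A temporary directory stands in for the package data directory.
target_dir = Path(tempfile.mkdtemp())
db_path = create_database(json.loads(EXAMPLE_CONFIG), target_dir)
print("CREATED:", db_path)
```

Because the statement uses CREATE TABLE IF NOT EXISTS, calling the function a second time leaves an existing database untouched.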

We can now import test data into those databases using the -l and -s options. The package checks that the databases exist before importing, which is why you see the same CREATED message again.

(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -l
CREATED: /home/ubuntu/.rexdbp/shopping.db
(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -s
CREATED: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/storemart.db
(databases) ubuntu@goodboy:~/myrepos/rex$ 
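
The import step could look roughly like the sketch below. It assumes the CSV file has a header row with item and price columns, which I have not shown in this article, and it uses an in-memory database plus an inline CSV string so it runs on its own. The import_rows name is hypothetical and is not the function used in the repo.

```python
import csv
import io
import sqlite3

# Assumed CSV layout: a header row followed by one product per line.
SAMPLE_CSV = "item,price\nbanana,0.4\napple,0.25\n"


def import_rows(connection: sqlite3.Connection, csv_file) -> int:
    """Insert every CSV row into productlist and return the number of rows read."""
    reader = csv.DictReader(csv_file)
    rows = [(row["item"], float(row["price"])) for row in reader]
    with connection:  # one transaction for the whole import
        # OR REPLACE keeps the import repeatable despite the UNIQUE item column.
        connection.executemany(
            "INSERT OR REPLACE INTO productlist (item, price) VALUES (?, ?)", rows
        )
    return len(rows)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE productlist (item text UNIQUE, price float)")
count = import_rows(connection, io.StringIO(SAMPLE_CSV))
print(f"Imported {count} rows")
```

With a real file you would pass open(path, newline="") instead of the StringIO object.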

Let’s output the database contents.

(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -L
('banana', 1)
('apple', 2)
('orange', 5)
('pasta', 1)
(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -S
('banana', 0.4)
('apple', 0.25)
('orange', 0.2)
('pasta', 1.99)
(databases) ubuntu@goodboy:~/myrepos/rex$ 

And how about listing all of the data files and databases?

I’m running this package inside a virtual environment, which is why site-packages is located where it is.

(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -f
app_config_path: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/appconfig.yaml
packagedb_configfilepath: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/config/packagedb.json
userdb_configfilepath: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/config/userdb.json
shoppinglist_datafile: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/sampledata/shoppinglist.csv
store_datafile: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/sampledata/store.csv
package_databasepath: /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/storemart.db
userdb_databasepath: /home/ubuntu/.rexdbp/shopping.db
(databases) ubuntu@goodboy:~/myrepos/rex$ 
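
The user database path in the last line of that output can be built with pathlib. This is only a sketch: the function names are made up for illustration, and the .rexdbp directory and shopping.db file names are taken from the output above.

```python
from pathlib import Path


def user_database_path(app_dir: str = ".rexdbp", filename: str = "shopping.db") -> Path:
    """Return the database location inside the current user's home directory."""
    return Path.home() / app_dir / filename


def ensure_parent_exists(db_path: Path) -> Path:
    """Create the containing directory if needed, so SQLite can create the file."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    return db_path


# Only print the path here; nothing is created until ensure_parent_exists() is called.
print("userdb_databasepath:", user_database_path())
```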

Let’s delete the databases we just created, and try to output their data.

(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -X
(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -S
/home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/storemart.db does not exist. Please create it. Check --help
(databases) ubuntu@goodboy:~/myrepos/rex$ rexdbp -L
/home/ubuntu/.rexdbp/shopping.db does not exist. Please create it. Check --help
(databases) ubuntu@goodboy:~/myrepos/rex$ 
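
The existence check matters because sqlite3.connect() creates an empty database file when the path does not exist, so a read would otherwise fail later with a confusing "no such table" error. Here is a minimal sketch of the check; read_table is a hypothetical name, not the function used in the repo.

```python
import sqlite3
import tempfile
from pathlib import Path


def read_table(db_path: Path, table: str):
    """Return all rows of table, or None when the database file is missing."""
    if not db_path.exists():
        # Without this check, sqlite3.connect() would create an empty file here.
        print(f"{db_path} does not exist. Please create it. Check --help")
        return None
    connection = sqlite3.connect(db_path)
    try:
        # The table name comes from our own code, never from user input.
        return connection.execute(f"SELECT * FROM {table}").fetchall()
    finally:
        connection.close()


# A path inside a fresh temporary directory stands in for a deleted database.
missing_db = Path(tempfile.mkdtemp()) / "storemart.db"
rows = read_table(missing_db, "productlist")
```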

Implementations

Please check the code on GitHub for the full implementations. Here are a few gotchas you need to watch out for.

pyproject.toml Entry

In pyproject.toml you must list every directory that contains data files you want to refer to from your Python code.
Each directory is written in dotted package notation, e.g. the package directory ‘pkgexampledatabases/data/config/’ is listed as ‘pkgexampledatabases.data.config’. For each directory you specify the file patterns to be packaged; you could use * to include everything.

You need to do this so that these files are included in the package you are building.

More info on packaging with setuptools here https://setuptools.pypa.io/en/latest/userguide/datafiles.html and here https://rexbytes.com/2022/08/28/python-packaging-with-setuptools/

[tool.setuptools.package-data]
"pkgexampledatabases.data"=["*.json","*.db"]
"pkgexampledatabases.data.config"=["*.json"]
"pkgexampledatabases.data.sampledata"=["*.csv"]
"pkgexampledatabases"=["*.yaml","*.db"]

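A quick way to check that your data files really were packaged is to list them through importlib.resources after installing. The sketch below builds a throwaway package called demo_pkg in a temporary directory so that it runs without installing anything; demo_pkg and list_data_files are names invented for this example. With the real package installed you would pass "pkgexampledatabases.data.sampledata" instead.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

# Build a tiny stand-in package on disk: demo_pkg/data/store.csv
root = Path(tempfile.mkdtemp())
data_dir = root / "demo_pkg" / "data"
data_dir.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")
(data_dir / "__init__.py").write_text("")
(data_dir / "store.csv").write_text("item,price\nbanana,0.4\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def list_data_files(package: str, suffixes=(".csv", ".json", ".db", ".yaml")):
    """Return the sorted names of data files found directly inside package."""
    container = importlib.resources.files(package)
    return sorted(
        entry.name
        for entry in container.iterdir()
        if entry.is_file() and entry.name.endswith(suffixes)
    )


print(list_data_files("demo_pkg.data"))
```

If a file you expect is missing from the list after a pip install, the matching pattern is probably missing from the package-data table.
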
Get Absolute Directory Path For DataFile Directory

Once you have made sure that your packaging process above includes your data files, you can find the absolute path to your data directory after a pip install by using the following code.

import importlib.resources


def get_data_directory():
    # For a normally installed (unzipped) package, str() gives the absolute path.
    my_traversable_resource_container = importlib.resources.files("pkgexampledatabases.data")
    directory_path = str(my_traversable_resource_container) + "/"
    return directory_path

Get Absolute Path For Any File

Similar to the above, but you add a joinpath(“yourfilename”) with the name of the file you want to access in the target directory.

import importlib.resources

filepath = ""
my_traversable_resource_container = importlib.resources.files("pkgexampledatabases.data").joinpath("MyFile.csv")
# as_file() yields a real filesystem path; for a zipped package it is a
# temporary copy that only exists inside the with block.
my_pathlib_context_manager = importlib.resources.as_file(my_traversable_resource_container)
with my_pathlib_context_manager as fullfilepath:
    filepath = str(fullfilepath)

The variable ‘filepath’ in the above example will contain the full absolute path to your file.
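
If you only need to read a data file, you can skip the absolute path and read straight from the resource. A database file is different, since sqlite3 needs a real filesystem path. The sketch below uses the standard library json package as a stand-in so that it runs without pkgexampledatabases installed; read_resource_text is a hypothetical helper name.

```python
import importlib.resources


def read_resource_text(package: str, filename: str) -> str:
    """Read a file that lives inside an importable package."""
    resource = importlib.resources.files(package).joinpath(filename)
    return resource.read_text(encoding="utf-8")


# Stand-in example: read the source of the standard library json package.
text = read_resource_text("json", "__init__.py")
print(len(text), "characters read")
```

With the example package installed, the equivalent call would be read_resource_text("pkgexampledatabases.data.sampledata", "store.csv").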

Data Directories

I’ve already stated this, but I’m doubling down as it may help you troubleshoot. You must have an ‘__init__.py’ file in each data directory, even if it contains no Python modules or code files.

Console

Don’t forget, we are using console commands to run our code. This is the entry in the pyproject.toml file that sets the console command to “rexdbp”.

[project.scripts]
rexdbp = "pkgexampledatabases:my_database_module.rexdbp"

More on console commands here.

Argparse

We’re continuing to use argparse to control our user input.

Here is a direct link to the argparse definitions for this package.

Here is more on using argparse.
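
As a rough guide, a parser that reproduces the help output shown earlier could be defined like this. The flags and help strings are copied from that output; the build_parser name and the choice of store_true actions are assumptions, so check the repo for the actual definitions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same flags as the rexdbp help output."""
    parser = argparse.ArgumentParser(prog="rexdbp", description="A package datafiles example")
    flags = [
        ("-p", "--createpdb", "Create package db from config file."),
        ("-u", "--createudb", "Create user db from config file."),
        ("-l", "--importlist", "Import shopping data to user home dir database"),
        ("-s", "--importstore", "Import store data to package dir database"),
        ("-f", "--files", "Show the full filepaths of all our files"),
        ("-L", "--outputlist", "Output current data in users shopping list database"),
        ("-S", "--outputstore", "Output current data in package store database"),
        ("-X", "--deletedbs", "Delete package and user databases"),
    ]
    for short, long, help_text in flags:
        # Each flag is a simple on/off switch.
        parser.add_argument(short, long, action="store_true", help=help_text)
    return parser


# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-p", "-S"])
print(args.createpdb, args.outputstore, args.deletedbs)
```

The console command function would then call build_parser().parse_args() with no arguments and act on whichever flags are set.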

Connecting Directly With SQLite

Given the file output list above, you can use the absolute filepath of the database files to directly connect and explore your data.

Here I am connecting from the bash shell, applying some sqlite3 environment settings, and then running a couple of queries.

Let’s look at the database we created inside of site-packages.

(databases) ubuntu@goodboy:~$ sqlite3 /home/ubuntu/myenvs/databases/lib/python3.10/site-packages/pkgexampledatabases/data/storemart.db
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
sqlite> .headers on
sqlite> .mode table
sqlite> .schema
CREATE TABLE productlist (item text UNIQUE,price float );
sqlite> select * from productlist;
+--------+-------+
|  item  | price |
+--------+-------+
| banana | 0.4   |
| apple  | 0.25  |
| orange | 0.2   |
| pasta  | 1.99  |
+--------+-------+
sqlite> .exit
(databases) ubuntu@goodboy:~$ 

I applied the above settings to get clearer column output, with column names, when querying with sqlite3.
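
You can do the same exploration from Python with the standard sqlite3 module. This sketch uses an in-memory database with two of the rows shown above so that it runs on its own; with the real database you would pass the absolute path from rexdbp -f to sqlite3.connect(). The show_schema and fetch_with_headers names are made up for this example.

```python
import sqlite3

# In-memory stand-in for storemart.db, using the schema shown above.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE productlist (item text UNIQUE, price float)")
connection.executemany(
    "INSERT INTO productlist VALUES (?, ?)", [("banana", 0.4), ("apple", 0.25)]
)


def show_schema(connection: sqlite3.Connection):
    """Rough equivalent of .schema: the CREATE statements stored by SQLite."""
    query = "SELECT sql FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in connection.execute(query)]


def fetch_with_headers(connection: sqlite3.Connection, query: str):
    """Rough equivalent of .headers on: return column names along with the rows."""
    cursor = connection.execute(query)
    headers = [column[0] for column in cursor.description]
    return headers, cursor.fetchall()


print(show_schema(connection))
headers, rows = fetch_with_headers(connection, "SELECT * FROM productlist")
print(headers)
for row in rows:
    print(row)
```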

Further Reading

Doc for importlib.resources : https://docs.python.org/3/library/importlib.resources.html

Packaging datafiles with setuptools: https://setuptools.pypa.io/en/latest/userguide/datafiles.html

Reading and Writing to package datafiles: https://rexbytes.com/2022/11/19/python-packaging-reading-writing-to-datafiles-inside-a-package/

Using Argparse to parse user commands: https://rexbytes.com/2022/09/06/python-parsing-command-line-arguments-with-argparse/

Calling your python with console commands: https://rexbytes.com/2022/09/01/python-packages-as-callable-console-scripts/
