PR#193: docs: recommend using PostgreSQL for development (over SQLite) - fedora-qa/blockerbugs

fedora-qa / blockerbugs

#193 docs: recommend using PostgreSQL for development (over SQLite)

Merged 2 years ago by kparal. Opened 2 years ago by kparal.

feature/postgres-by-default into develop

pull postgres:13 instead of latest

Kamil Páral • 2 years ago

0e2b3d7

rename blockerbugs container to blockerbugsdb

Kamil Páral • 2 years ago

ae524c8

docs: recommend using PostgreSQL for development (over SQLite)

Kamil Páral • 2 years ago

c068129

.gitignore

file modified

		`@@ -19,5 +19,6 @@`

		`# local config`
		`/blockerbugs_db.sqlite*`
		`+ /*.pgdump`
		`/conf/settings.py`
		`/env_blockerbugs/`

README.md

file modified

+3 -3

		`@@ -16,13 +16,13 @@`

		`## Documentation`

		`- To see complete documentation of this project, run this command in the base of`
		`- the source tree:`
		+ To see a complete documentation of this project, look into `docs/source/`
		`+ directory. You can also run this command in the project root directory:`
		```
		`make docs`
		```

		- You can then see the docs in ``./docs/_build/html/index.html``.
		+ And then you'll find the rendered docs in `docs/_build/html/index.html`.

		`## Deployment`

docs/source/development.rst

file modified

+57 -10

		`@@ -3,26 +3,65 @@`
		`====================================`

		`Setting up a development environment for the blocker tracking app is pretty`
		`- easy but different from installing the app in a production environment`
		`-`
		`- FIXME: talk about git-flow`
		+ easy but different from installing the app in a :doc:`production environment <installation>`.


		`Required Fedora Packages`
		`========================`

		`- The instructions here will use virtualenv so that there are no potential`
		`- conflicts with system libs. At a bare minimum, you will need:`
		`+ The instructions below will use virtualenv so that there are no potential conflicts with system libs. However, you'll still need at least these packages on the host system:`

		`* python3-virtualenv`
		`* python3-devel`
		`* libcurl-devel`
		`* krb5-devel`
		`* openssl-devel`
		`+ * postgresql`
		`+`
		`+`
		`+ .. _set-up-database:`
		`+`
		`+ Set up a PostgreSQL database`
		`+ ============================`
		`+`
		+ PostgreSQL should be used both for development and production. You can set it up similarly to how we :doc:`configure production <installation>`, but you'll find a more convenient development setup below using containers.
		`+`
		+ Run a `Podman`_ container with the latest `PostgreSQL image`_::
		`+`
		`+ podman run --detach --name blockerbugsdb \`
		`+ --env POSTGRES_PASSWORD=your-password \`
		`+ --env POSTGRES_DB=blockerbugs \`
		`+ --env PGDATA=/var/lib/postgresql/pgdata \`
		`+ --publish 5432:5432 \`
		`+ docker.io/library/postgres:13`
		`+`
		`+ .. note::`
		+ The container uses a non-standard path for ``PGDATA``, so that the database is stored inside the actual container (and not on a mounted volume). That allows you to later use ``podman commit`` to easily back up and experiment with database structure and contents, see :ref:`manipulating-database`.
		`+`
		`+ Wait 10 seconds to let the dabatase initialize and then check that it works::`
		`+`
		`+ $ psql --host=localhost --username=postgres --command='select version();'`
		`+ Password for user postgres:`

		`- If you plan on developing against a database other than sqlite, you will need`
		`- to have that installed and configured, similarly to the production install`
		`- documentation FIXME: NEED LINK TO PRODUCTION DOCS HERE`
		`+ version`
		`+ ---------------------------------------`
		`+ PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu ...`
		`+ (1 row)`
		`+`
		+ If you don't want to type in your password every time, `set up`__ ``~/.pgpass``.
		`+`
		`+ Hint: With Podman, you can start, stop, and list existing containers like this::`
		`+`
		`+ podman start blockerbugsdb`
		`+ podman stop blockerbugsdb`
		`+ podman ps -a`
		`+`
		`+ .. note::`
		+ We used to recommend SQLite for development, because it's trivial to set up and use. However it comes with serious disadvantages. The database upgrades (performed using :ref:`Alembic <using-alembic>`) fail for certain operations, or you might end up with a different schema. That's why SQLite is no longer recommended and all developers should use PostgreSQL instead.
		`+`
		`+ .. _Podman: https://podman.io/whatis.html`
		`+ .. _PostgreSQL image: https://hub.docker.com/_/postgres`
		`+ __ https://www.postgresql.org/docs/current/libpq-pgpass.html`


		`Create a Virtualenv`
		`@@ -54,9 +93,9 @@`
		`Configuring dev environment`
		`===========================`

		`- To generate an initial config for a sqlite database, use the following command::`
		`+ To generate an initial config using a PostgreSQL database, use the following command::`

		`- python run_cli.py generate_config -d 'sqlite:////path/to/code/blockerbugs_db.sqlite'`
		`+ python run_cli.py generate_config -d 'postgresql+psycopg2://postgres:your-password@localhost:5432/blockerbugs'`

		The configuration file will be generated in the ``conf/`` directory. While you
		would copy this config file to ``/etc/blockerbugs`` in a production system, for
		`@@ -67,6 +106,8 @@`

		`FIXME: LINK TO CONFIGURATION OPTIONS`

		`+ .. _initializing_env:`
		`+`
		`Initializing the Environment (Lazy Method)`
		`------------------------------------------`

		`@@ -154,3 +195,9 @@`

		`flake8`
		`mypy`
		`+`
		`+`
		`+ More info on development`
		`+ ========================`
		`+`
		+ In order to learn more about regular situations encountered by a developer, please see :doc:`development_tasks`.

docs/source/development_tasks.rst

file modified

+77 -2

		`@@ -27,6 +27,75 @@`
		`do not have write privileges, this will be done for you`


		`+ .. _manipulating-database:`
		`+`
		`+ Viewing and manipulating the database`
		`+ =====================================`
		`+`
		`+ If you need to perform some changes to the database content or structure (e.g. when experimenting with new untested syncing code, or when writing Alembic scripts), there are several ways to proceed.`
		`+`
		`+ pg_dump`
		`+ -------`
		`+`
		+ You can use ``pg_dump`` or ``pg_dumpall`` to back up the whole database, and ``pg_restore`` or ``psql -f`` to restore the database (see `this tutorial`__).
		`+`
		`+ __ https://snapshooter.com/learn/postgresql/pg_dump_pg_restore`
		`+`
		`+ Here's an example of a backup and restore::`
		`+`
		`+ pg_dump --host=localhost --username=postgres --format=custom --file=blockerbugs.pgdump blockerbugs`
		`+ # the following will completely drop and recreate the blockerbugs db`
		`+ pg_restore --host=localhost --username=postgres --dbname=postgres --clean --create blockerbugs.pgdump`
		`+`
		`+`
		`+ podman commit`
		`+ -------------`
		`+`
		+ Alternatively, you can also use ``podman commit`` to save the current container as a new image. If something goes wrong, you then remove the existing container and create a new one based on the saved image (with the backed-up database contents). For example::
		`+`
		`+ podman stop blockerbugsdb`
		`+ podman commit blockerbugsdb localhost/blockerbugsdb-backup`
		+ # You can now find the backup image in `podman images`.
		+ # You can remove it any time with `podman image rm`.
		`+ # Feel free to continue development with your current database:`
		`+ podman start blockerbugsdb`
		`+`
		`+ # If you break your database, restore it from the backup:`
		`+ podman rm --force blockerbugsdb`
		`+ podman run --detach --name blockerbugsdb --publish 5432:5432 localhost/blockerbugsdb-backup`
		`+`
		+ Please note that once you have a container based on your ``localhost/blockerbugsdb-backup`` image, you can't remove that image. If that bothers you, you can (after all is done) copy out the database files out of the current container, create a :ref:`new container from scratch <set-up-database>` based on the upstream image, and copy in those files::
		`+`
		`+ podman stop blockerbugsdb`
		`+ mkdir pgdata`
		`+ # the /. is important in these commands`
		`+ podman container cp blockerbugsdb:/var/lib/postgresql/pgdata/. pgdata/`
		`+ # re-create the container now (see above), and stop it`
		`+ podman container cp pgdata/. blockerbugsdb:/var/lib/postgresql/pgdata/`
		`+`
		`+ Now you're free to remove your backup image.`
		`+`
		`+`
		`+ GUI`
		`+ ---`
		`+`
		+ If you want to display and optionally edit/backup/restore the tables in a GUI client, there's `pgadmin3`_ in Fedora repositories, `Sequeler`_ on Flathub (viewing only), and `many other clients available`__.
		`+`
		`+ .. _pgadmin3: https://src.fedoraproject.org/rpms/pgadmin3`
		`+ .. _Sequeler: https://flathub.org/apps/details/com.github.alecaddd.sequeler`
		`+ __ https://wiki.postgresql.org/wiki/PostgreSQL_Clients#Open_Source`
		`+`
		`+`
		`+ Resetting the database`
		`+ ----------------------`
		`+`
		+ If you want to simply reset the blockerbugs database into the default empty state (i.e. no tables), instead of restoring from a backup, you can either a) remove the current container and create :ref:`a new one from scratch <set-up-database>`, or b) you can even faster drop the existing database and create a new empty one like this::
		`+`
		`+ dropdb --host=localhost --username=postgres blockerbugs`
		`+ createdb --host=localhost --username=postgres blockerbugs`
		`+`
		+ You can then continue with :ref:`initializing the environment <initializing_env>`.
		`+`
		`Release Process`
		`===============`

		`@@ -114,6 +183,9 @@`

		`make kojibuild`

		`+`
		`+ .. _using-alembic:`
		`+`
		`Using Alembic`
		`=============`

		`@@ -121,6 +193,9 @@`
		`lightweight database migration tool for usage with the SQLAlchemy Database`
		`Toolkit for Python.' and used to manage the database schema used with the application.`

		`+ .. warning::`
		+ Always use a PostgreSQL database when creating and testing an Alembic script. That's the same database we use for :doc:`production <installation>`. The autogeneration code will differ when created on different databases, that's why PostgreSQL must be used. Certain operations also behave differently. In particular, an SQLite database will fail for most table altering operations (like changing or dropping a column).
		`+`
		`To install alembic, use (inside the virtualenv)::`

		`pip install alembic`
		`@@ -141,8 +216,8 @@`

		`alembic revision -m "Create a table"`

		`- This will create a file called, e.g., 42d71a06dd50_create_a_table.py in`
		`- alembic/versions. The file contains upgrade and downgrade methods that need`
		+ This will create a file called, e.g., ``42d71a06dd50_create_a_table.py`` in
		+ ``alembic/versions/``. The file contains upgrade and downgrade methods that need
		`to be adjusted to reflect desired migration.`

		`In many cases, alembic is able to determine the steps needed for upgrade and`

docs/source/installation.rst

file modified

+3 -15

		`@@ -3,13 +3,13 @@`
		`============`

		`The recommended method for installing the blocker tracking app is to use an RPM`
		`- that you've built from source (see devel tasks) or released by the developers.`
		+ that you've built from source (see :doc:`development_tasks`) or released by the developers.
		`At this time, the application is not packaged in the main Fedora repositories`
		`and at this time, there are no concrete plans to change that.`

		`.. Note::`
		`These instructions assume that you have already done basic setup like opening`
		`- any appropriate firewall ports, already have postgresql-server installed,`
		+ any appropriate firewall ports, already have ``postgresql-server`` installed,
		`initialized and running as a service.`

		`Installing Packages`
		`@@ -20,9 +20,7 @@`

		`sudo yum install blockerbugs-$VERSION-$RELEASE.noarch.el6.rpm`

		`- While the blocker tracking app should work with any database engine supported by`
		`- SQLAlchemy, the main production instance uses PostgreSQL and our development`
		`- instances are generally using SQLite3. YMMV with other database engines.`
		`+ While the blocker tracking app should work with any database engine supported by SQLAlchemy, the main production instance uses PostgreSQL and that's what we recommend even for development environments. YMMV with other database engines.`

		`Setting up the Database`
		`=======================`
		`@@ -40,16 +38,6 @@`
		`sudo service postgresql start`
		`sudo chkconfig enable postgresql`

		`- Start and enable the postgres service::`
		`-`
		`- sudo service postgresql start`
		`- sudo chkconfig enable postgresql`
		`-`
		`- Start and enable the postgres service::`
		`-`
		`- sudo service postgresql start`
		`- sudo chkconfig enable postgresql`
		`-`
		`.. Note::`
		`This isn't fully finished yet, it is assumed that you know how to setup postgres.`

requirements.txt

file modified

		`@@ -14,6 +14,7 @@`
		`koji`
		`mock`
		`munch`
		`+ psycopg2-binary`
jskladan commented 2 years ago @frantisekz - I'm not sure about the build process now, don't we use requirements.txt to generate the specfile deps now? If so, how does it fare with the `psycopg2-binary`?
		`py`
		`pycurl`
		`pytest-cov`

kparal commented 2 years ago

This changes all documentation to discourage people from using SQLite (even for
development) and recommends to use PostgreSQL instead. The docs are updated to
include setup instructions for PostgreSQL as a podman container, and adds
guidance how to control it, back it up, etc.

The motivation for this change is that I encountered serious issues when
adjusting DB scheme (including writing Alembic scripts) with an SQLite DB.
Operations like editing a column, removing a column, or changing constraints
fail with SQLite. Some of those issues can be worked around with a "batch" mode
of Alembic, some can't. Even worse, the Alembic scripts' autogeneration seems to
be related to the currently running DB. So even if you make sure something works
locally, it might not work in production. It is therefore important to develop
these changes with the same DB as in production. Furthermore, database upgrades
(executed automatically when detected) might fail on SQLite, which means the
developer needs to wipe away whole DB and create it from scratch.

It would be possible to keep using SQLite for development where DB schema is not
changed, and add some hints when DB upgrades fail. However, with simple enough
instructions for installing and managing PostgreSQL, it seemed unnecessary to
keep officially supporting both options. Having just a single option makes
things simpler and safer, and the new documentation should help people not
familiar with PostgreSQL.

zuul commented 2 years ago

Build succeeded.

blockerbugs-pytest : SUCCESS in 1m 41s

kparal commented 2 years ago

I'd appreciate if people with PostgreSQL/Podman experience could tell me if the instructions make sense. Of course all commands are tested and working, but perhaps there's a better way to do things? Thanks.
@jskladan @lbrabec @frantisekz

lbrabec commented 2 years ago

Thanks, this is great.

Podman related stuff looks OK although using long params seems little excessive to me...

One small nitpick: I would name the container something like "blockerbugsdb" instead of "blockerbugs".

kparal commented 2 years ago

Podman related stuff looks OK although using long params seems little excessive to me...

I agree, yet it's intentional. In documentation I believe it's better to use the long params, because they are much more obvious, and readers don't need to immediately open the man page just to figure out what -U or -d means. If people use it regularly, I expect them to using the short params if they want. For copy&paste, it doesn't really matter. So overall this approach seems better to me.

One small nitpick: I would name the container something like "blockerbugsdb" instead of "blockerbugs".

That's a good call. I'll rename it.

1 new commit added

rename blockerbugs container to blockerbugsdb

2 years ago

zuul commented 2 years ago

Build succeeded.

blockerbugs-pytest : SUCCESS in 2m 02s

jskladan commented on line 40 of docs/source/development.rst 2 years ago

I thought the :latest is a default (maybe just for docker?) value, thus excessive here. Not that it's any issue.

jskladan commented on line 5 of requirements.txt 2 years ago

@frantisekz - I'm not sure about the build process now, don't we use requirements.txt to generate the specfile deps now? If so, how does it fare with the psycopg2-binary?

kparal commented 2 years ago

I thought the :latest is a default (maybe just for docker?) value, thus excessive here. Not that it's any issue.

After reading these:
https://vsupalov.com/docker-latest-tag/
https://medium.com/@mccode/the-misunderstood-docker-tag-latest-af3babfd6375
perhaps I should replace :latest with an actual version (in this case :13)? And update that doc when PostgreSQL 14 is out and we want to switch to it? Or I can mirror the version in production in these docs (that's a bit more maintenance, though)? Thoughts?

kparal commented 2 years ago

@frantisekz - I'm not sure about the build process now, don't we use requirements.txt to generate the specfile deps now? If so, how does it fare with the psycopg2-binary?

I haven't found any code which would do a spec file generation. I believe many of the pypi projects wouldn't match the rpm names anyway, so a translation table would be needed.

frantisekz commented 2 years ago

@frantisekz - I'm not sure about the build process now, don't we use requirements.txt to generate the specfile deps now? If so, how does it fare with the psycopg2-binary?

I haven't found any code which would do a spec file generation. I believe many of the pypi projects wouldn't match the rpm names anyway, so a translation table would be needed.

The pypi > rpm mapping is handled already by python dependency generators in Fedora for us. That's not a issue. (We probably should just purge all the Requires from the spec/adjust setup.py if needed)

Dependencies from setup.py are used in the rpm generation ( https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/#Automatically-generated-dependencies ), adding pscyopg2-binary to requirements.txt wouldn't break this.

1 new commit added