#11970 Create hosting space for new matrix bot for assisting with fedora virtual matrix events
Closed: Fixed with Explanation 7 months ago by kevin. Opened 7 months ago by moralcode.

NOTE

Before creating request for Communishift namespace read our
Communishift User Guide

Communishift namespace creation


Project Name: eventbot

Project Administrators: moralcode

Requires Persistent Storage?: yes

How much space do you need?: probably under 1 GB, but idk, the 5GB default should be okay

How is this project related to Fedora?


I am currently working on further automating the workflow I started during the F40 release party, which handled the (then somewhat manual) process of getting pretix attendees into the Matrix room for the event. See https://gitlab.com/fedora/commops/interns/-/issues/16#top

There is a strong interest in using this same workflow for the Fedora Week of Diversity virtual event in a few weeks.

NOTE: This project may involve the temporary processing (and possible, but avoidable, storage) of the personally identifiable information (PII) of event attendees for the duration of the event. I understand that this violates the policy on the use of Communishift as written below, and I would like to start a discussion with someone about how to either a) ensure that this project can use Communishift in a compliant way (such as by purging all PII for an event after the event is over, so that the only PII that remains is in existing platforms such as pretix), or b) find another hosting platform to use.


By submitting this ticket you agree that you have read and understood
https://docs.fedoraproject.org/en-US/infra/communishift/#_service_usage_requirements


Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: low-gain, low-trouble, ops

7 months ago

I endorse this request!

So, the reason we wanted to forbid handling PII on these instances is to avoid the entanglements that come with that, i.e.:
What do we do if someone requests a GDPR delete? How do we know if they have information in there and who is responsible to delete it?
Same for reporting. Same for if the data is leaked. :(

Could it be possible that your app just refers to data that's in, say, the account system? Or does it need to store additional PII?
(I'm thinking of some kind of hash that just describes attendees via a fasjson link/etc. Then, the only thing stored is public info).

Could it be possible that your app just refers to data that's in, say, the account system? Or does it need to store additional PII?
(I'm thinking of some kind of hash that just describes attendees via a fasjson link/etc. Then, the only thing stored is public info).

Not every user who signs up for a Fedora event in pretix is going to have a FAS account. It is possible to make that a requirement and adapt the bot accordingly, but that may add friction to the registration process and reduce event attendance by new or casual contributors to the project.

I'm unfamiliar with fasjson, so I'm not sure if or how it could be used here.

What do we do if someone requests a GDPR delete?
see "compliance with right to be forgotten" below

How do we know if they have information in there and who is responsible to delete it?

see "compliance with right of access" below.

Who is responsible (compliance, reporting, and deletion): because the bot is designed not to keep information longer than necessary (i.e. after the event the information is, or will be, automatically deleted), I guess the technically correct answer is that the bot is responsible for deleting the information. Automating it helps ensure the deletion is performed consistently, rather than relying on humans whose internships may end or whatever (see https://gitlab.com/fedora/commops/interns/-/issues/16?work_item_iid=61). Because a machine can't be legally responsible, it would probably fall on the developer (me) or whoever the current maintainer is. Failing that, it would probably fall on whoever is ultimately responsible for running the event, as they would have had to be somewhat involved in the decision to use the bot in the first place.


some additional stuff that may help

source code/transparency
The code for the bot I plan to run is being developed at https://github.com/MoralCode/maubot-events. There's also a tracking issue here for progress updates on planned features. That said, here is the intended behavior of this bot:

Summary of what the bot does:
1. host a matrix bot (it is patterned off of the same kind of code that runs zodbot and meetbot in various fedora matrix spaces)
2. connect to the pretix.eu ticketing platform to authenticate and query attendees for a given event (either using their REST API or via webhooks sent to this bot via an internet-reachable domain name)
3. pull down data about event registrations (so data in transit will likely contain PII)
4. filter this data down to only what is needed (matrix username submitted by the attendee during registration, and a unique, but not personally identifiable without other information, order ID from pretix. this is all still in transit)
5. filter out registrations where the person has already been invited (I plan to use either an in-memory or on-disk list of previously processed order IDs, so whether that is over the line of PII storage or not may be a gray area)
6. perform some validation of Matrix IDs to filter out and report obviously incorrect values
7. report the failure to invite users with obviously incorrect Matrix IDs to some authority figure (either the bot's admin/hostmaster, or event organizers) so they can manually look up the order ID in pretix to see the values that caused it to fail, and possibly email the user (using data already in pretix) to get better data.
8. invite the users with valid matrix IDs to the matrix room for the event
9. generate and export/persist aggregate, non personally identifiable summary statistics after the event for total registrations, total invitations, and total accepted invitations
10. purge any stored data after the event is over (can also include the revocation of API credentials).
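
For illustration only, steps 4 to 7 could be sketched roughly as follows in Python. This is not the bot's actual code: the record layout (`order_id`, `matrix_id`), the function names, and the choice to mark only successfully selected orders as processed are all assumptions, and the regular expression is a rough shape check rather than the full Matrix user ID grammar.

```python
import re

# Rough shape check only ("@localpart:server", optional port). This is an
# assumption for illustration, not the full grammar from the Matrix spec.
MATRIX_ID_RE = re.compile(r"@[^\s:@]+:[^\s:@]+(:\d{1,5})?")


def looks_like_matrix_id(value):
    """Return True if value is plausibly a Matrix user ID."""
    if len(value.encode("utf-8")) > 255:
        return False
    return MATRIX_ID_RE.fullmatch(value) is not None


def select_invites(registrations, processed_order_ids):
    """Split new registrations into invites and order IDs to report.

    registrations: iterable of dicts with hypothetical keys
        "order_id" and "matrix_id" (already filtered down, step 4).
    processed_order_ids: set of order IDs handled earlier (step 5);
        updated in place for orders selected for invitation.
    """
    to_invite = []   # (order_id, matrix_id) pairs for step 8
    to_report = []   # order IDs only, so no attendee input is copied (step 7)
    for reg in registrations:
        order_id = reg["order_id"]
        if order_id in processed_order_ids:
            continue
        matrix_id = (reg.get("matrix_id") or "").strip()
        if looks_like_matrix_id(matrix_id):
            to_invite.append((order_id, matrix_id))
            processed_order_ids.add(order_id)
        else:
            to_report.append(order_id)
    return to_invite, to_report
```

In this sketch the only value that outlives a single pass is the set of order IDs, and the failure report carries order IDs rather than the submitted values, so an organizer would look the details up in pretix.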

My personal GDPR assessment (not a lawyer, and maybe not fully familiar with the details of GDPR):
What information: The bot technically has access (via the pretix API) to anything that was entered into the pretix platform, but to operate it only needs the value the user entered for the Matrix ID question during registration. The order ID (a unique/random 5-ish character alpha string) from the registration is also used for handling edge cases {2}. Much of the concern depends on whether the order ID should be considered personal information. I would also be curious whether anything from my side as the developer can be considered personal information, such as API keys or the Matrix IDs of event organizers that are included in configuration files for security and access control.

Legitimate basis: Essentially personal information (i.e. matrix ID) is only needed during a narrow time window surrounding an event {1} that the user signed up to attend (so there's probably a legitimate reason to be using it under GDPR).

Data storage and use: this information may be processed (actively used by the program, in transit and/or in memory) or stored (in memory or persisted to disk, such as the list of already-processed order IDs, or system logs).

Compliance with right to be forgotten: shutting down the instance should clear any information held in system RAM. Deleting any persisted files (logs or the list of processed order IDs) should clear the rest of it. If you want to be absolutely sure all personal info from anyone is gone, the config files can be deleted too. This can be done during the event (which may be minimally disruptive by requiring organizers to reauth with pretix), or for short events it could be possible to wait until the event is over, if that's within the time window GDPR allows for a response to a request.
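
As a rough illustration of the file-deletion part, a purge step might look like the Python sketch below. The directory layout, the file names, and the idea of keeping a `summary.json` with the aggregate statistics are assumptions made for the example, not details of the actual bot.

```python
from pathlib import Path


def purge_event_data(data_dir, keep=("summary.json",)):
    """Delete every regular file in data_dir except the names in keep.

    data_dir and the default keep list are hypothetical; the intent is to
    drop logs and the processed-order list while retaining only aggregate,
    non-identifying statistics. Returns the names of the removed files.
    """
    removed = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file() and path.name not in keep:
            path.unlink()
            removed.append(path.name)
    return removed
```

Passing `keep=()` would remove everything in the directory, which corresponds to the case where config files are deleted as well, provided they live in the same place.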

Compliance with right of access: if the information in this ticket (this description, the source code linked above, etc.) isn't enough to respond to a request, I could also add a command to the bot to return the information it has for a given order ID or Matrix user, but it very likely won't be much information at all.

Sorry, I got a bit carried away, so that was probably way more than was necessary, but I hope all that info helps inform the decision on how to do this in a compliant way.

{1} between when event registration opens and within a day or two of the event ending
{2} the order ID is used to allow a human to locate that order in pretix and review the data and/or contact the user in case they submitted a Matrix ID that was not valid. This is all done outside of the bot and probably falls under the GDPR compliance of the pretix platform

@moralcode is working with me so I can get a better understanding of the data, how it flows, and if anything is stored.

After discussing this with some CPE team members and getting more info from @jflory7 and @moralcode, we found that this may run in Communishift if @moralcode decides to pursue the option.

Sounds good! I will likely be pursuing this option as I work on getting this bot merged into the rest of the Fedora bot infrastructure.

Also, I have included a diagram (a public revision of the one from my discussion with Steve) in the README of the bot's repo (https://github.com/fedora-infra/maubot-pretix-invite), in case that is helpful for evaluating the data flow of this bot.

Metadata Update from @kevin:
- Issue assigned to kevin

7 months ago

I've created the fas group ( communishift-eventbot ) and added the project to communishift.

Please let us know if you need anything further.

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

7 months ago

Thanks @smilner @kevin for helping sort this one out! Much appreciated! :muscle:

