#114 Use metrics dashboard to visualize fedmsg data
Opened 2 years ago by skamath. Modified 4 months ago

This was initially planned by @bex. In the last meeting, we discussed the possibility of moving our metrics (which are decentralized right now) to a single dashboard.

The idea is to use the GrimoireLab tools and write wrappers around them for fedmsg, Pagure and any other data sources we might require.


Metadata Update from @skamath:
- Issue assigned to skamath
- Issue tagged with: meeting, metrics, needs feedback

2 years ago

Metadata Update from @skamath:
- Issue untagged with: meeting

a year ago

@skamath Can you update this ticket with discussions on this from FLOCK?

I'm going to quote the bit about Grimoire Labs from the CommOps workshop report article.

What is Grimoire Lab?

Sachin is leading progress on a new tool for CommOps called Grimoire Lab. Grimoire Labs is a visual dashboard tool that lets a user create charts, graphs, and visual measurements from a common data source. The vision for Grimoire Lab in Fedora is to build an interactive dashboard based off of fedmsg data. Using the data, anyone could create different gauges and measurements in an easy-to-understand chart or graph. This helps make the fedmsg data more accessible for others in the project to use, without making them write their own code to create graphic measurements.

Most of the time for Grimoire Lab in the workshop was spent explaining its purpose and expected use. Sachin explained some of the progress made so far to make the tool available in Fedora. The goal is to get it hosted inside of Fedora's infrastructure next. We hope to deliver an early preview of this over the next year.

Metadata Update from @jflory7:
- Issue priority set to: no deadline
- Issue set to the milestone: Future releases

a year ago

Metadata Update from @jflory7:
- Issue tagged with: meeting

a year ago

Discussed in 2017-10-30 meeting.

Choosing the tool

On our Flock 2017 article, we received a comment advising us against using Grimoire Labs. While tooling isn't the most important part of this process, it seemed like valid feedback to warrant more research on what options work best for our needs.

One alternative option could be Grafana. We need to research the different pros and cons to both Grafana and Grimoire Labs. I plan to do this research and leave a comment in this ticket within two weeks with the comparison. @skamath also had a concern of a potential blocker with Grimoire Labs, and will comment with more details on that.

CHAOSS

@bex also mentioned the Community Health Analytics Open Source Software project, or CHAOSS. This project aims to bring standards and uniformity to how we collect and use various data and metrics tools. It would be a good reference to see if we can align our work to these standards. The Linux Foundation wrote a blog post about it here.

This will also be a part of our research for picking the right tool.

FAD work

As a sub-note, this ticket would be a good one to break into different milestones, or sub-tickets. This could be a great task to have as a goal for completion / first implementation during a FAD ( #125 ).

Metadata Update from @jflory7:
- Issue untagged with: needs feedback
- Issue priority set to: minor (3-4 weeks) (was: no deadline)
- Issue set to the milestone: Fedora 28 (to May 2018) (was: Future releases)

a year ago

Discussed in 2017-11-06 meeting.

New feedback from Grimoire devs

A lead developer of Grimoire Labs commented on our blog post and left more food for thought. I may reach out to him to get feedback / advice on our situation and how Grimoire Labs may work well for us.

Research pending

This ticket is blocked by research. I'm due for a comparison between the two platforms (Grimoire / Grafana) and @skamath is due to detail the possible blocker with Grimoire Labs. We hope to have some preliminary research by next week.

GrimoireCon

@bex also noted that GrimoireCon, the annual conference for Grimoire Labs, is on Feb. 4th in Brussels (conveniently close to FOSDEM). Something to consider for #125.

Alright, so I got a chance to dig a bit deeper into Grimoire toolset last week and I think it can solve most of the problems we have at the moment. I was listing down a few pros and cons and we can discuss 'bout this later.

PROS

  • All things Python!
  • Awesome set of tools - each with its own purpose and docs.
  • Allows cross origin user identification. Different user name on a totally different server? No problem. sortinghat has got you covered.
  • Support for mailman, github, and a lot of other tools out of the box.
  • Caches all data pulled.
  • Backed by Bitergia, a parent company who provides metrics services to corporates. Bug fixes on Grimoire should be pretty quick!
  • Plugin based approach so fedmsg can be "plugged into" the existing system.

CONS

  • Encourages usage of Kibiter, which is a soft fork of Kibana. (The lead developer of Grimoire Lab has made it clear that vanilla Kibana can be used)
  • Any large change (unlikely, but possible) can put Kibiter in a bad shape.

Apart from the above mentioned points, I don't see any problem with Grimoire toolset. It has got great set of tools and the community is small, but really active. I don't have any objections going ahead with it :)

I feel like I wrote this somewhere, but I don't remember where ...

I'd like to see us do something with a view/query engine soon because we won't really know what we want until we start to see graphs and ask questions. I don't see us as making a commitment for the ages, just for today.

What can we do to move this forward?

Discussed in 2017-11-20 meeting.

Summary

Briefly discussed @skamath's pros/cons and decided to cross-compare his list with Grafana, then make a final decision by next Monday so we can begin pressing forward with Infrastructure.

Pending research

@skamath's research covers the pros and cons of Grimoire, but doesn't look at Grafana. We were going to look once more at Grafana to figure out its strengths and weaknesses compared to Grimoire, and then make a decision. I am actioned to do this research during our hack session tonight.

The research comparison will go into this ticket.

Final decision

At our next meeting, we will evaluate the research and determine which platform to pursue based on our needs. @skamath noted a Perceval plugin is needed for Grimoire and fedmsg, if we go that route. Ideally, we will open a conversation with Infrastructure about this after next week's meeting.

Discussed in 2017-11-27 meeting.

Pending research

I'm overdue for research on the Grafana vs. Grimoire comparison. I'm working on this now during the hack session.

Dashboard story brainstorming

Since the technical work was blocked, we discussed ideas for what kind of questions we want to answer with the metrics dashboard. We tried to come up with ideas for what we would want to implement once the dashboard is set up. We can consider now what queries are needed to generate some of this data.

  • Active contributors by: FAS group, country, last_seen
  • Ambassadors FAS group membership mapped over time to number of badges (to ask, if there is correlation)
  • Cross core contributors: Users who are core members of multiple FAS groups
  • Measure of time between earning "beginner" badges and an authentic "contribution" badge
  • Measuring the "distance" between FAS groups: Are contributors active in different FAS groups?
    • Are our sub-projects isolated groups or a large mass that cross-pollinates?
  • Average time for response on mailing list
  • The fastest and slowest times for a new / beginner contributor to become a core contributor (compared across different FAS groups)
  • "Do we go back or not?": Measuring users who get an event badge against activity in the project (are people we engage with getting into our community after the event?)
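As a rough illustration of the kind of query these stories imply, here is a minimal sketch for the "average time for response on mailing list" idea. It is not tied to Grimoire or any real mailing-list API: the message shape (`id`, `in_reply_to`, `sent`) and the function name are assumptions made up for this example.

```python
from datetime import datetime, timedelta


def average_first_reply_time(messages):
    """Return the mean delay between a message and its earliest direct reply.

    `messages` is a list of dicts with hypothetical keys:
    "id", "in_reply_to" (parent id or None) and "sent" (datetime).
    Messages that never received a reply are ignored.
    Returns None when no message has a reply.
    """
    sent_at = {m["id"]: m["sent"] for m in messages}

    # Track the earliest reply seen for each parent message.
    first_reply = {}
    for m in messages:
        parent = m["in_reply_to"]
        if parent is None or parent not in sent_at:
            continue
        if parent not in first_reply or m["sent"] < first_reply[parent]:
            first_reply[parent] = m["sent"]

    if not first_reply:
        return None

    delays = [first_reply[p] - sent_at[p] for p in first_reply]
    return sum(delays, timedelta()) / len(delays)
```

In a real dashboard this would more likely be an aggregation over indexed data than a Python loop, but it shows what fields we would need to collect.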

See "I contributed! Now what?" as a reference to understand the life cycle of a contributor.

We will use Grimoire

In our hack session today, we decided that Grimoire is the best fit for our needs because it supports various data sources used in Fedora and has a clear pathway to add fedmsg support.

The Grimoire website has a section where it talks about its data sources. Perceval is the platform we'd write a fedmsg integration plugin for. @skamath suggested we first develop the fedmsg plugin before requesting a hosted instance of Grimoire from Fedora Infrastructure.
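To make the plugin idea a bit more concrete, here is a minimal sketch of what a fedmsg fetcher might hand to the rest of the toolchain: each raw message wrapped in a metadata envelope. It deliberately does not import Perceval. The envelope fields are only modelled on my understanding of what Perceval items look like, and the input keys (`msg_id`, `timestamp`), the backend name and the category value are all assumptions for illustration that should be checked against the Perceval documentation.

```python
import hashlib
import time


def wrap_fedmsg_items(raw_messages, origin):
    """Yield each raw fedmsg message wrapped in a metadata envelope.

    `raw_messages` is an iterable of dicts with (assumed) keys
    "msg_id" and "timestamp". `origin` identifies the data source.
    """
    for raw in raw_messages:
        # Derive a stable identifier from the origin and the message id,
        # so fetching the same message twice yields the same uuid.
        key = f"{origin}:{raw['msg_id']}".encode("utf-8")
        yield {
            "backend_name": "Fedmsg",                 # hypothetical backend name
            "origin": origin,
            "uuid": hashlib.sha1(key).hexdigest(),
            "category": "message",                    # hypothetical category
            "updated_on": float(raw["timestamp"]),    # when the message was sent
            "timestamp": time.time(),                 # when we fetched it
            "data": raw,                              # untouched original payload
        }
```

The real plugin would also need to handle fetching and paging through the message history, which is left out here.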

CommOps server as devel environment

In GSoC 2016, Infrastructure granted CommOps a Fedora server to use for fedmsg / data projects. We could use this machine as a "staging" instance for Grimoire while we work on the fedmsg integration plugin.

I'm working on updating the machine and we hope to have an install of Grimoire in the next week or two. The actual installation is low priority.

In the next meeting, we can discuss the logistics of the plugin in depth and create a timeline for working on it and possibly add it to the FAD agenda.

Discussed in 2017-12-04 meeting.

Estimating a timeline

I asked @skamath to brainstorm a rough timeline for the Perceval fedmsg plugin. A timeline helps us estimate how much work is required and how long it will take.

  • What is required?
  • About how much work is involved?
  • Do you foresee any possible blockers?

CommOps cloud machine

I upgraded the machine to Fedora 27, but have not started playing with Grimoire. It's on the list, but it's also low priority since we don't have anything to test yet either.

Discussed during 2017-12-11 meeting.

Estimating a timeline

@skamath was blocked by final exams and plans to work on this one this week. (Also, congrats again Sachin on completing your undergraduate program! :smile: )

Sorry for the back-to-back comments. During the FAD discussion, we talked about dashboard use cases, and it felt more relevant for this ticket than #125.

What problem does this solve?

Visualizing fedmsg data in a dashboard allows us to create charts, tables, and other visual aids to understand data in fedmsg. To use it now, you have to write code / scripts to interact with the data. The dashboard lets anyone, regardless of tech knowledge, see easy-to-understand visualizations about the health of their project / sub-project in Fedora.

For example, a maintainer of a large project (e.g. Pagure) might see the number of commits over time, the number of new contributors in a month, etc. The localization team may have a geographic chart for languages, so if a new language suddenly starts receiving many translations, the community could reach out and support the new translators in their efforts.

(Thanks @meskarune for helping us frame our discussion around this question first.)

Use cases

By visualizing fedmsg data in a dashboard with charts, graphs, and other visual aids, some things we would want to do are…

  • Figuring out what times (days / months) people contribute the most, using bar graphs
    • Planning internal Fedora team sprints when more people are contributing
  • Evaluate event success if new contributors enter the community after an event is completed
  • Understand new contributors better and where they snap off (if they do)
    • The Epic Journey of the New Fedora Contributor (i.e. using the visualizations to see pathways / successes / failures of new contributors entering the community for the first time)
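For the first use case above (finding the days people contribute the most), a minimal sketch could look like the following. It assumes we already have a list of Unix timestamps for contribution events; the function names are made up for this example, and a real dashboard would draw a proper bar graph instead of text.

```python
from collections import Counter
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def contributions_by_weekday(timestamps):
    """Count events per weekday name, interpreting timestamps as UTC."""
    return Counter(
        WEEKDAYS[datetime.fromtimestamp(ts, tz=timezone.utc).weekday()]
        for ts in timestamps
    )


def text_bars(counts):
    """Render one text bar per weekday, Monday first."""
    return [f"{day:<9} {'#' * counts[day]}" for day in WEEKDAYS]
```

Calling `counts.most_common(1)` on the result gives the busiest day, which is the number we would look at when planning a team sprint.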

We're aware the Fedora Hubs team conducted a lot of research on these topics. Combining their research into our Grimoire implementation would be helpful (and avoids duplicating over a year's worth of work already done).

@jflory7 Thank you for summing up all the points! I need some minions too :stuck_out_tongue:

For constant updates and planning, I have created an etherpad to prevent constant e-mail notifications. I'll keep updating it as I complete the tasks and I encourage all of you working on it to do the same :)

Link to etherpad : http://etherpad.osuosl.org/commops-metrics-dashboard

Discussed in 2017-12-18 meeting.

Exploring plugin development

@skamath set up a local dev environment and is reading training manuals. There are no blockers as of now. As he explores plugins more, he will update the ticket with his findings.

Removing from meeting agenda for now – @skamath will re-add it when feedback is needed or we need to discuss as a group.

Metadata Update from @jflory7:
- Issue untagged with: meeting
- Issue priority set to: normal (1-2 weeks) (was: minor (3-4 weeks))

a year ago

Metadata Update from @jflory7:
- Issue unmarked as blocking: #42

a year ago

Discussed in CommOps 2018 FAD.

Quick recap

We want to collect metrics to understand the Fedora community better and to provide the community a tool for data analysis.

Deliverable

One-year plan to develop and implement GrimoireLabs dashboard; create timeline / basic goals to do this

Metrics collection

We started by creating three types of target areas for metrics we want to collect:

  • Users
  • Teams / community health
  • Fedoraland

We wrote questions for each target area and aggregated them by category.

Users

  • Activity stats
    • Activity in different tools in Fedora
    • Issues raised: Pagure / GitHub / Bugzilla
    • Individual ratio: Pagure, Koji, git builds
    • Is the $user community or technical?
    • If the $user is an ambassador, what do they do in a year? (contribute to other teams too? or just events)
  • User patterns / paths
    • What are the $user's contribution patterns across different areas of Fedora?
      • Which areas do people start with and how do they move forward? e.g. started with wiki, then git
      • Are technical contributors also engaging in community tasks and vice versa?
      • Involved in different teams? Activity in different teams?
    • When do I contribute the most? Dropout/ Burnout?
    • How many people contribute for each month of the year? Is there a trend?
    • How active is the $user throughout the year?
    • Mattdm's user graph
    • Identify times when we gain new contributors and trends in contributor growth

Teams / community health

  • Activity
    • Fedora contribution types (wiki, commit, etc.) for different teams
    • What team has a sudden increase in participation? Newcomer onboarding and retention
    • (Community Engagement) Reply speed (mail, commit / issue) by Pagure, Bugzilla, topics?
    • Impact of objectives:
      • Since an Objective's start, is there an increase or decrease in fedmsg activity/team members/engagement/newcomers for the affected team?
    • Activity (mailing list or otherwise) by day of week and time of day in each contributor's local timezone
  • Interaction
    • Most used words on IRC? Mailing lists? (wordcloud)
    • Mailing list, IRC, commit volume for team health
    • People who start discussions
      • Identify the influencers (contributors who are expert in area and guide others) vs. learner (newcomer, seeks information)
    • Do active users dominate a mailing list?
    • Is a mailing list "healthy" for engaging contributors?
      • Single posts with no replies (assuming we've filtered out automated reports) are discouraging. Megathreads can indicate community passion and big issues, but can also be indicative of communication problems.
      • See mattdm's ticket
  • Regions
    • Community by regions / country (APAC, LATAM)
    • Ambassadors by region, and in which regions do events occur
  • Newcomers
    • In what ratio does a specific team have contributors of the following kinds, and how has that evolved over the last few years?
      • Long-time
      • Intermediate
      • Newbies
      • Newcomer stats (joins, leaves) What happens to contributors that first appear in a team? Do they stay?
    • FAS stats
    • What activities have successful on-boarding paths? (e.g. people come to mailing lists and drop off, but IRC meetings bring long-term contributors?)
    • Newcomer onboarding by region
  • Misc.
    • (unicorn) All the mattdm Flock metrics
    • How well is a $fas_group performing?
    • GSoC-specific weekly stats
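For the mailing-list health questions in the list above (single posts with no replies, megathreads), a minimal sketch of the measurement might be the following. The input is an assumed mapping of thread id to message count, with automated reports already filtered out, and the megathread threshold of 50 messages is an arbitrary value chosen only for illustration.

```python
def thread_health(thread_sizes, mega_threshold=50):
    """Summarize thread sizes for one mailing list.

    `thread_sizes` maps a thread id to its number of messages.
    Returns the share of threads that never got a reply and the
    ids of threads at or above `mega_threshold` messages.
    """
    total = len(thread_sizes)
    if total == 0:
        return {"unanswered_ratio": 0.0, "megathreads": []}

    # A thread with a single message is a post that nobody replied to.
    unanswered = sum(1 for n in thread_sizes.values() if n == 1)
    mega = sorted(t for t, n in thread_sizes.items() if n >= mega_threshold)

    return {"unanswered_ratio": unanswered / total, "megathreads": mega}
```

Neither number says much on its own, so these would be most useful tracked over time per list.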

Fedoraland

  • (storytelling) Magazine / CommBlog / social media
    • Magazine stats - what does our community want to hear about? engage with and by region?
    • Community Blog - what does our community want to hear about? engage with and by region?
    • Social media metrics (activity, newcomers, engagement, best discussions)
  • Elections
    • Did the last election have more or fewer participants?
    • Election voting by different parts of the project? Region?
  • Ask Fedora
    • What questions do they ask by stats / tags?
    • Blocked questions statistics - which questions are stuck (no answers)? any particular tags?
  • Events
    • During a Fedora event (Flock, Ambassadors, etc.), do we see new contributors?
    • How do events affect existing contributors?
    • Overlay graphs with events happening at certain times, key things (not just conferences) may have occurred, etc.
  • Badges
    • What badges are the most popular? How many people have them?
    • Badge stats for fun

Tooling

  • Grimoire
    • Kidash packaging?
    • New features:
      • FAS
      • fedmsg
      • Pagure
      • Social media: Twitter, FB, G+, etc.
    • Unblocking Grimoire dashboard in Fedora
      • Sending people to GrimoireCon
      • Perceval plugin to collect the data we want
      • Finding where to put this (talk to infra)
      • Get a mock dashboard hosted, to experiment
    • GrimoireCon:
      • Plans to package as RPM?
      • Analyzing Docker potential to deploy in OpenShift
  • Happiness Packets
    • Make a container to contain the application
    • Verify no other requirements other than Postgres and scope that
    • Document needed environment variables to pass into container for start up
    • Consider update strategies like FLIBS or CentOS Container Pipeline

Metadata Update from @jflory7:
- Issue tagged with: FAD

a year ago

This came up at GrimoireCon: a super cool Kibana plugin demoed by one of Bitergia's staff members, Network plugin for Kibana 5

Metadata Update from @jflory7:
- Issue tagged with: meeting

a year ago

Discussed in 2018-02-19 meeting.

Documenting usage

One of the next major milestones for Kibana / Grimoire is documenting how to use, log into, and manage the Grimoire dashboard on the CommOps cloud machine. During our meeting, @skamath shared a blocker that limits how we can share login information publicly.

User permissions and Guard

We want to share a public, read-only account in our documentation and then manage "admin"-type users to build the dashboards, widgets, etc. This would be CommOps members or other Fedora contributors with a strong interest in metrics / data collection.

@skamath mentioned a Kibana plugin called "Shield" developed by the Elasticsearch team. This was a possible option to solve the advanced permissioning issue for multiple users. @skamath plans to research this (and possibly implement it, depending) by next Monday, Feb. 26, 2018.

This blocks documentation until we figure out how to share login credentials with the public.

Timeline

This is mostly keeping us to our timeline discussed in the FAD for February. We may slip on documentation, but I think all else considered, we are doing well.

commops-feb-timeline.png

Discussed in 2018-03-19 meeting.

We are pending an update on importing a new data feed type into Grimoire. I'm following up with @skamath for an update for our milestone planned at the FAD.

Metadata Update from @jflory7:
- Issue untagged with: meeting
- Issue priority set to: minor (3-4 weeks) (was: normal (1-2 weeks))
- Issue tagged with: blocked

11 months ago

Metadata Update from @jflory7:
- Issue set to the milestone: Future releases (was: Fedora 28 (to May 2018))

11 months ago

Metadata Update from @jflory7:
- Issue tagged with: team - commops

7 months ago

Metadata Update from @jflory7:
- Assignee reset
- Issue priority set to: no deadline (was: minor (3-4 weeks))

5 months ago

In IRC today, @algogator was interested in working on the Perceval fedmsg plugin.

@algogator Let us know how we can help you get started and do awesome things! I'm not sure where you are ready to begin since you worked with @skamath to get caught up.

Metadata Update from @jflory7:
- Issue priority set to: next meeting (was: waiting on external)

5 months ago

Metadata Update from @jflory7:
- Issue priority set to: waiting on external (was: next meeting)

5 months ago

Metadata Update from @jflory7:
- Issue untagged with: blocked
- Issue assigned to algogator
- Issue priority set to: waiting on assignee (was: waiting on external)

5 months ago

Metadata Update from @jflory7:
- Issue untagged with: type - FAD
- Issue tagged with: type - coding

5 months ago

@algogator I am also really interested in this. How does it look?

I didn't have a lot of time to work on this the past few months and I don't think I can till December
