mining-tools:gitdm.git

Germán Póo-Caamaño [Tue, 1 Dec 2009 02:32:19 +0000 (23:32 -0300)]
Added new dump to CSV

Two new dumps were added: for filetype and for every changeset.

Germán Póo-Caamaño [Thu, 26 Nov 2009 18:04:28 +0000 (15:04 -0300)]
Fixed CSCount which should not count merges

Patches, as well as Total* and Dates, are counted only if the
changeset is not a merge. However, CSCount (ChangeSetCount)
was counting everything, which changes the results a bit.
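
A minimal sketch of the counting rule described above, assuming each
changeset is a plain dict with the hypothetical keys "merge", "added" and
"removed" (these names are illustrative and are not taken from gitdm):

```python
def summarize(changesets):
    """Count changesets and line totals, skipping merges throughout."""
    totals = {"CSCount": 0, "TotalAdded": 0, "TotalRemoved": 0}
    for cs in changesets:
        if cs["merge"]:
            # Merges contribute nothing, including to the changeset count.
            continue
        totals["CSCount"] += 1
        totals["TotalAdded"] += cs["added"]
        totals["TotalRemoved"] += cs["removed"]
    return totals
```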

Germán Póo-Caamaño [Wed, 25 Nov 2009 04:27:07 +0000 (01:27 -0300)]
Splitted the grabpatch from the parser

Created the class LogPatchSplitter, whose only mission is to get
each commit as a set of lines.  The class provides an iterator, which
makes the code cleaner and easier to read.
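
A rough sketch of what such an iterator could look like.  Only the class
name comes from the commit; the constructor argument and the rule that a
new commit starts at a line beginning with "commit " are assumptions about
plain `git log` output, not the actual gitdm implementation:

```python
class LogPatchSplitter:
    """Iterate over a git log stream, yielding one commit at a time."""

    def __init__(self, fd):
        # fd is any iterable of text lines, e.g. sys.stdin or an open file.
        self.fd = fd

    def __iter__(self):
        patch = []
        for line in self.fd:
            # Assumed boundary: a header line such as "commit <sha>".
            # Message lines are indented by git log, so they cannot match.
            if line.startswith("commit ") and patch:
                yield patch
                patch = []
            patch.append(line)
        if patch:
            # Emit the final commit, which has no following header.
            yield patch
```

A caller can then write `for patch in LogPatchSplitter(sys.stdin): ...`
and handle each commit's lines independently.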

Germán Póo-Caamaño [Wed, 25 Nov 2009 03:41:51 +0000 (00:41 -0300)]
Added initial support for file type reports

It can distinguish between code, documentation, translations, etc.,
which provides the basis for more accurate reports.

Germán Póo-Caamaño [Tue, 24 Nov 2009 02:55:08 +0000 (23:55 -0300)]
Added a function to parse the stats per file

To make the code cleaner, I created a function that parses a
numstat line and extracts the name of the changed file, the
lines added, and the lines removed.
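
As an illustration, a numstat line has the form
"<added> TAB <removed> TAB <filename>", with "-" in both count fields for
binary files.  The function name and return order below are assumptions,
not the ones used in gitdm:

```python
def _count(field):
    """Convert a numstat count field to int; binary files report '-'."""
    return 0 if field == "-" else int(field)


def parse_numstat_line(line):
    """Return (filename, added, removed) for one `git log --numstat` line."""
    # Split on the first two tabs only, so the filename is kept whole.
    added, removed, filename = line.rstrip("\n").split("\t", 2)
    return filename, _count(added), _count(removed)
```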

Germán Póo-Caamaño [Tue, 24 Nov 2009 02:17:03 +0000 (23:17 -0300)]
Use a dict of patterns instead of different global variables

Using a dictionary makes the code cleaner and easier to read.

Germán Póo-Caamaño [Thu, 19 Nov 2009 03:27:21 +0000 (00:27 -0300)]
Get the statistics from numstat instead of diff

The option --numstat gives the statistics per file and
is not as verbose as the option -p.

Germán Póo-Caamaño [Sun, 15 Nov 2009 04:36:56 +0000 (01:36 -0300)]
Use csv package instead of the manual handling

Jonathan Corbet [Mon, 23 Nov 2009 18:17:13 +0000 (11:17 -0700)]
From: Iestyn Pryce <dylunio@gmail.com>

Preserve spaces in the second parameter of the overall config file.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Jonathan Corbet [Fri, 24 Jul 2009 23:11:41 +0000 (17:11 -0600)]
Get the developer count right even without full patch info

Greg Kroah-Hartman [Fri, 24 Jul 2009 18:26:01 +0000 (11:26 -0700)]
gitdm: report issue when an email address is a "name"

This probably means an incorrect commit message; it also
means that, if it is not fixed, the category for this person is probably
going to be incorrect.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Greg Kroah-Hartman [Fri, 24 Jul 2009 18:26:00 +0000 (11:26 -0700)]
add more spaces to the "done" message so it doesn't show the trailing 0

Jonathan Corbet [Fri, 24 Jul 2009 19:56:21 +0000 (13:56 -0600)]
Reduce the number of "funky email" gripes

Addresses of the form "user at host.wherever" can be trivially repaired, so
let's do so.

A couple of other minor tweaks are included here as well; nothing which
changes behavior.
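
A minimal sketch of the "user at host.wherever" repair, assuming the
address arrives as a bare string; the pattern and function name are
hypothetical and not copied from gitdm:

```python
import re

# Assumed shape: a local part, the literal word "at", then a dotted host.
_AT_WORD = re.compile(r"^(\S+) at (\S+\.\S+)$")


def repair_email(addr):
    """Turn 'user at host.wherever' into 'user@host.wherever'."""
    match = _AT_WORD.match(addr.strip())
    if match is None:
        # Anything else is returned untouched for the caller to judge.
        return addr
    return match.group(1) + "@" + match.group(2)
```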

Jonathan Corbet [Sat, 21 Mar 2009 21:29:57 +0000 (15:29 -0600)]
Quick hack to make the developer/employer counts at the top correct

...before we were counting everybody we knew about, regardless of whether
they did anything in the period we're looking at.

Jonathan Corbet [Tue, 10 Feb 2009 22:35:18 +0000 (15:35 -0700)]
Make the internal merge pattern a bit more general

Jonathan Corbet [Tue, 10 Feb 2009 22:25:02 +0000 (15:25 -0700)]
Add a copyright notice to treeplot

Jonathan Corbet [Tue, 10 Feb 2009 21:27:19 +0000 (14:27 -0700)]
Sort the output text

...also make a pseudo tree for changesets which go straight to the
mainline.

Jonathan Corbet [Tue, 10 Feb 2009 20:52:58 +0000 (13:52 -0700)]
A quick and dirty treeplot utility

This is a tool to make a graphviz input file describing the patch flow into
the mainline.

Jonathan Corbet [Thu, 13 Nov 2008 16:13:25 +0000 (09:13 -0700)]
Better email address handling

Some people quote their names in various tags:

    Something-done-by: "J Random Hacker" <...>

We kept the quotes with the name, confusing things down the road.  So
change the patterns to exclude those quotes when they exist.
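
One way to exclude optional quotes is to match them outside the capture
group.  The pattern below is an illustrative assumption, not the one in
gitdm, and the sample address is made up:

```python
import re

# Tag name, optionally quoted real name, then the address in angle brackets.
# The quotes sit outside the name group, so they are never captured.
TAG_PATTERN = re.compile(r'^\s*([\w-]+):\s*"?([^"<]*?)"?\s*<([^>]*)>')


def parse_tag(line):
    """Return (tag, name, email) for a credit line, or None if no match."""
    match = TAG_PATTERN.match(line)
    return match.groups() if match else None
```

With this, a quoted and an unquoted form of the same name yield the same
result, so both are credited to one person.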

Jonathan Corbet [Tue, 11 Nov 2008 18:11:04 +0000 (11:11 -0700)]
Tested-by / Reported-by credits and more

Add tracking of tested-by, reported-by, and reviewed-by.  For the first
two, we also track who is *giving* those credits.

While I was in the neighborhood I also:

 - Started turning the "patch" class into something more than a bare
   container; this work has just begun.

 - Moved the report-writing code into its own file (reports.py)

Jonathan Corbet [Thu, 16 Oct 2008 17:48:09 +0000 (11:48 -0600)]
Use find() instead of index()

That keeps it from crashing on seemingly malformed addresses.  Change
suggested by Dave Foster.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
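
For context, str.index() raises ValueError when the substring is missing,
while str.find() returns -1.  A small sketch of the difference, using a
hypothetical helper that is not part of gitdm:

```python
def split_address(addr):
    """Split 'user@host' into (user, host); host is '' if '@' is missing."""
    at = addr.find("@")  # -1 when absent; addr.index("@") would raise
    if at < 0:
        return addr, ""
    return addr[:at], addr[at + 1:]
```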

Michael Meeks [Mon, 29 Sep 2008 16:46:37 +0000 (17:46 +0100)]
gitdm patch ...

Hi guys,

I knocked up a patch to generate some per-month, by-affiliation
statistics from the gitdm output; attached for interest or merging.

A sample of the output, complete with OO.o data-pilot, and pretty chart
is here:

http://www.gnome.org/~michael/data/2008-09-29-linux-stats.ods

with chart here:
http://www.gnome.org/~michael/images/2008-09-29-kernel-active.png

caption being:

"Graph showing number and affiliation of active kernel developers
(contributing more than 100 lines per month). Quick affiliation key,
from bottom up: Unknown, No-Affiliation, IBM, RedHat, Novell, Intel ..."

These are as yet not published, I plan to use them as a comparison to
OO.o's somewhat mediocre equivalents; hope to go live with them soon
(and fix the horrible bugs in stacked area charts to make them actually
pretty ).

HTH,

Michael.

--
 michael.meeks@novell.com  <><, Pseudo Engineer, itinerant idiot

Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Jonathan Corbet [Fri, 5 Sep 2008 19:53:35 +0000 (13:53 -0600)]
Don't accept totally bogus dates

Yanmin Zhang committed a patch (09f2724a786f76475ef2985cf84f5359c553aade)
which claims to have been written in August, 2030.  Code that bleeding-edge
makes gitdm confused, so pretend it's just normal, contemporary stuff.
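
A sketch of one way to handle such dates, assuming that "contemporary"
means replacing any date in the future with the current date; the commit
does not say exactly which value gitdm substitutes, and the function name
is hypothetical:

```python
import datetime


def sane_date(claimed, today=None):
    """Return claimed unless it lies in the future, else today's date."""
    if today is None:
        today = datetime.date.today()
    # A commit cannot have been written after the moment it is analysed.
    return today if claimed > today else claimed
```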

Greg KH [Thu, 24 Jul 2008 00:24:14 +0000 (17:24 -0700)]
finally get the config file stuff correct

Need to seed the database _after_ loading the config file,
otherwise we don't see the seeds as actually showing up for their
companies.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Greg KH [Thu, 24 Jul 2008 00:03:13 +0000 (17:03 -0700)]
parse the config file _after_ we have read the command line options

Otherwise it doesn't matter if we change the config file option or not...

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Greg Kroah-Hartman [Wed, 23 Jul 2008 23:17:53 +0000 (16:17 -0700)]
make -c option actually work

The -c option was not fully implemented

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Jonathan Corbet [Fri, 18 Jul 2008 21:34:28 +0000 (15:34 -0600)]
Fix up the copyright notices.

Jonathan Corbet [Fri, 18 Jul 2008 21:04:55 +0000 (15:04 -0600)]
Move regular expressions out to patterns.py

...I need them for an associated tool I'm working on.

Jonathan Corbet [Tue, 1 Jul 2008 18:11:43 +0000 (12:11 -0600)]
Get rid of a debugging print statement.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Jonathan Corbet [Fri, 27 Jun 2008 15:31:19 +0000 (09:31 -0600)]
A bunch of domain map additions from Greg

Kir Kolyshkin [Mon, 7 Apr 2008 19:59:18 +0000 (23:59 +0400)]
gitdm: Report progress to stderr not stdout

When gitdm is used for generating a text-only report with its output
redirected to a file, all is well aside from the clutter at the beginning
of that file -- a very long line with repeating "Grabbing changesets...".

Solve that by redirecting progress reporting to stderr. It also helps to
see the progress when you redirect gitdm output to a file.

Also, we don't have to flush stdout since stderr is unbuffered by default.

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
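
A minimal sketch of progress reporting on stderr; the function name is
hypothetical.  The explicit flush is a defensive assumption for newer
Python versions, where a text stream may hold back output that does not
end in a newline:

```python
import sys


def report_progress(message, stream=None):
    """Write a progress message to stderr, leaving stdout for the report."""
    if stream is None:
        stream = sys.stderr
    # "\r" returns the cursor to the start of the line, so the next
    # message overwrites this one on a terminal.
    stream.write(message + "\r")
    stream.flush()
```

Running `gitdm > report.txt` then leaves only the report in the file while
the progress messages still appear on the terminal.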

Jonathan Corbet [Fri, 27 Jun 2008 14:58:35 +0000 (08:58 -0600)]
Initial commit

First commit of gitdm to the new repo.  Call it version 0.10 or something
silly like that.