Cron job to nag the project leads (as found in the foundation database) to update their eclipse-project-info.xml file at least once per quarter
Any specific rules that need to be followed besides this being a quarterly job?
It needs to check that the sections that should change each quarter have actually changed. The question, then, is which sections should be different each quarter?
* <summary> needs to be different
* <releases> needs to be different (some milestones should have been achieved)
This program should run once a week. The correctness rules should be in a separate PHP object so that they are easy to add to and modify. This program should use the ProjectInfo object. Actually, it should enhance that object as ExtendedProjectInfo so that it can keep track of where various bits of data were found.
The up-to-date/out-of-date reminder will come from a rule that compares the <summary> with a copy of the summary from three months ago; if they are the same, it includes a reminder in the email to the project lead. Other reminders will include:
* missing sections
* sections that are defined in old files, such as using .dates.txt instead of project-info.xml, or keeping a project-page-paragraphs.html in the root rather than in /project-info/
Each reminder must include the exact steps necessary to correct the problem. These can be as simple as "write a new executive summary" or "move /project-page-paragraph.html to /project-info/project-page-paragraph.html". In some cases, we could even provide the translation for the reader, e.g., from .dates.txt to project-info.xml <releases>.
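The "exact steps" requirement can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the PHP the tool is meant to use, and the names `Reminder`, `legacy_file_reminders` and `format_email_body` are hypothetical. It also assumes that .dates.txt sits in the project root and that the list of existing files is already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reminder:
    problem: str  # what the checker found
    fix: str      # the exact step the project lead should take


def legacy_file_reminders(paths):
    """Return reminders for files that still live in old locations.

    `paths` is an iterable of site-relative paths (an assumed input shape).
    """
    present = set(paths)
    reminders = []
    # Assumption: the old dates file is found at the project root.
    if "/.dates.txt" in present:
        reminders.append(Reminder(
            problem="release dates are kept in /.dates.txt",
            fix="move the dates into the <releases> section of project-info.xml",
        ))
    old = "/project-page-paragraph.html"
    new = "/project-info/project-page-paragraph.html"
    if old in present and new not in present:
        reminders.append(Reminder(
            problem=f"{old} is in the project root",
            fix=f"move {old} to {new}",
        ))
    return reminders


def format_email_body(reminders):
    """Render one line per reminder; an empty string means nothing is wrong."""
    return "\n".join(f"- {r.problem}: {r.fix}" for r in reminders)
```

Keeping the problem and the fix as separate fields means the email text can be reworded later without touching the checks.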
*** Bug 109223 has been marked as a duplicate of this bug. ***
See bug 109223 for some things to implement in this bug.
Also see bug 109219
Questions:
1. If we are using the new XML format, how are we going to handle the "summary" tag? In the new format that tag contains a URL pointing to where the summary is.
2. Where do we get the past XML files to compare with? Are we defining a directory, or is one already defined?
You'll need to cache the old files to compare against. The summary URL points to some content; fetch that content; compare it against the cache.
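A minimal sketch of the fetch-and-compare step, in Python rather than the tool's PHP. The function name `compare_with_cache`, the `summary.txt` file name and the injected `fetch` callable are all assumptions made for illustration; a real version would fetch over HTTP.

```python
from pathlib import Path


def compare_with_cache(project_key, summary_url, fetch, cache_dir):
    """Fetch the summary content and compare it with the cached copy.

    `fetch` is any callable that maps a URL to its text (hypothetical).
    Returns "first-seen", "changed" or "unchanged". The cache is written
    only when no earlier copy exists, so repeated runs keep comparing
    against the same baseline.
    """
    cache_file = Path(cache_dir) / project_key / "summary.txt"
    current = fetch(summary_url)
    if not cache_file.exists():
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(current, encoding="utf-8")
        return "first-seen"
    cached = cache_file.read_text(encoding="utf-8")
    return "unchanged" if cached == current else "changed"
```

Passing `fetch` in as a parameter keeps the comparison testable without network access.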
(In reply to comment #3)
> This program should run once a week. The correctness rules should be in a
> separate PHP object so that they are easy to add to and modify.
>
> <summary> with a copy of the summary from three months ago and if they are the
> same, include a reminder in the email to the project lead.
So the program is to run every week, updating the cached summary and comparing it, but only send out emails every third month?
Question: What format are we using for the "correctness rules"? An example will be very useful.
Correctness rule: PHP code or Perl code; whatever works. I can imagine a directory of class files, each class file defining a process( new, old ) method. The method returns a string or undef. If a string, the string is the error message to send. The new and old are the project-info objects now and the previous version. Obviously the project-info object needs to be extended with error handling and with timestamps for this to work.
For example, one correctness check would be that the <summary> tag includes a @paragraph-url attribute. If it includes a @url attribute without a @paragraph-url attribute, the error message would say something like "be sure to include a paragraph-url in addition to the url you already have; the difference between the paragraph-url and the url is ...". If the tag doesn't exist or doesn't have any attributes, the error message would be different, perhaps "be sure to include a <summary> tag with a paragraph-url as per the instructions here..."
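The process( new, old ) contract described above might look like the following sketch. It is in Python rather than PHP or Perl, so None stands in for undef. Modelling the project-info object as a dict whose "summary" entry holds the tag's attributes is an assumption of this example, as is the class name.

```python
class SummaryParagraphUrlRule:
    """Check that <summary> carries a paragraph-url attribute."""

    MISSING = "be sure to include a <summary> tag with a paragraph-url"
    URL_ONLY = ("be sure to include a paragraph-url in addition to "
                "the url you already have")

    def process(self, new, old):
        """Return an error message string, or None when the check passes.

        `new` and `old` are hypothetical project-info objects, modelled
        here as dicts. This particular rule only needs `new`.
        """
        summary = new.get("summary")
        if not summary:
            # No <summary> tag, or a tag without any attributes.
            return self.MISSING
        if "paragraph-url" in summary:
            return None
        if "url" in summary:
            return self.URL_ONLY
        # Attributes exist but neither of the expected ones is present;
        # treating this like a missing tag is a choice made by this sketch.
        return self.MISSING
```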
Question: What time stamp do we want to have inside the project-info object? Is it the one that records when the object was created?
Various bits of the info will need their own time stamps. For example, the time stamp for the summary url page will be different from the time stamp for the summary paragraph-url, which in turn will be different from the time stamp for the overall project-info.xml file.
So we have new and old project-info objects for each project. But how are we creating the old one, given that the constructor receives the key of the project and pulls its data directly from temporary or the web?
You'll have to cache them away when you run the tool on a regular basis. The different rules will probably have different time spans so we might need to cache them for different lengths of time, but I imagine that a rule would want to say "for this project-info, give me an old version as of 3 months ago". And then I could compare the old and new text, timestamps, etc.
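The "give me an old version as of 3 months ago" lookup can be sketched as a small timestamped store. This is Python, not the tool's PHP; `SnapshotStore` and `three_months_before` are hypothetical names, and approximating a quarter as 91 days is an assumption.

```python
import bisect
from datetime import datetime, timedelta


class SnapshotStore:
    """Keep timestamped copies of one project's project-info text."""

    def __init__(self):
        self._times = []      # capture times, kept sorted
        self._snapshots = []  # snapshot text, parallel to _times

    def add(self, taken_at, text):
        """Insert a snapshot while keeping both lists sorted by time."""
        i = bisect.bisect_right(self._times, taken_at)
        self._times.insert(i, taken_at)
        self._snapshots.insert(i, text)

    def as_of(self, when):
        """Return the newest snapshot taken at or before `when`, or None."""
        i = bisect.bisect_right(self._times, when)
        return self._snapshots[i - 1] if i else None


def three_months_before(now):
    # Assumption of this sketch: a quarter is treated as 91 days.
    return now - timedelta(days=91)
```

A rule would then compare `store.as_of(now)` with `store.as_of(three_months_before(now))`. Rules with different time spans only need a different offset.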
Created attachment 33834 [details] Files involved with the bug
I have placed the working program at "http://phoenix.baueralonso.com/projects/auto_reminder.php". I would like you (Bjorn) to take a look at it to see what we are doing. Right now we have:
- Check for updates on the executive summary.
- Check for updates on releases.
- Check not using "dates.txt".
- Check not using "project-page-paragraph.html" in project root folder.
There are some comments/questions:
1. Is this tool running as a cron job or from the web?
2. I created a "projects/cache" directory to save all the temporary "old" object copies, but we need this directory to have write permissions for "others". Is that possible on your side?
3. I'm printing the timestamps we are putting on the project-info object, but we are still not sure about their functionality.
4. The output of this program is roughly what the email will contain for each project.
5. Where can we obtain the e-mail address of each project leader?
6. The warnings it shows are because of non-existent project directories inside temporary. Are we creating these?
7. What other elements/files should this tool check?
I'm attaching a zip with all the files involved with this bug, just in case you want to check the logic we are using.
Re comment 17 -
> 1. Is this tool running as a cron job or from the web?
As per the original spec (see above), it's a cron job.
> 2. I created a "projects/cache" directory to save all the temporary "old" object copies, but we need this directory to have write permissions for "others". Is that possible on your side?
I don't know - how would I check?
> 3. I'm printing the timestamps we are putting on the project-info object, but we are still not sure about their functionality.
What's the question?
> 4. The output of this program is roughly what the email will contain for each project.
Well, I think we'll need to work on friendlier prose.
> 5. Where can we obtain the e-mail address of each project leader?
The Foundation database has a project leader relationship for each project. The project-leaders (could be more than one) each have an email address.
> 6. The warnings it shows are because of non-existent project directories inside temporary. Are we creating these?
I don't understand the question. The code should work whether or not there are temporary project-info.xml files.
> 7. What other elements/files should this tool check?
See comment 11 - there should be a sub-directory named "rules" or something that contains PHP files. The main checker will run each of those rules on the project info. Thus I can easily add additional rules by creating a new file in that directory.
> Just in case you want to check the logic we are using.
See comment 11 - these should be in a sub-directory and not hard-coded into the main class.
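A minimal sketch of a "rules" sub-directory that the main checker scans, so that adding a rule means adding a file. It is in Python rather than PHP, so it looks for *.py files. The function names and the convention of instantiating the first class declared in each file are assumptions of this example.

```python
import importlib.util
import re
from pathlib import Path

# Matches a top-level class declaration and captures the class name.
CLASS_RE = re.compile(r"^class\s+(\w+)", re.MULTILINE)


def load_rules(rules_dir):
    """Instantiate the first class declared in each *.py file in rules_dir."""
    rules = []
    for path in sorted(Path(rules_dir).glob("*.py")):
        match = CLASS_RE.search(path.read_text(encoding="utf-8"))
        if match is None:
            continue  # no class declared, so not a rule file
        spec = importlib.util.spec_from_file_location(path.stem, str(path))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        rules.append(getattr(module, match.group(1))())
    return rules


def run_rules(rules, new, old):
    """Collect the non-empty messages returned by each rule's process()."""
    messages = []
    for rule in rules:
        message = rule.process(new, old)
        if message:
            messages.append(message)
    return messages
```

With this layout the main checker never names an individual rule, so nothing is hard-coded into the main class.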
(In reply to comment #18)
> Re comment 17 -
> > 2. I created a "projects/cache" directory to save all the temporary "old" object copies, but we need this directory to have write permissions for "others". Is that possible on your side?
> I don't know - how would I check?
It's a permissions issue. It is technically possible; we just want to make sure there is no policy restriction (or something) on your side. (I guess not =) ).
> > 3. I'm printing the timestamps we are putting on the project-info object, but we are still not sure about their functionality.
> What's the question?
Right now we create a timestamp for each element that is checked by the autoreminder tool. We create those timestamps when creating the project-info object, so all the elements have the same timestamp. But the comment above (#13) says each element has to have a different timestamp. How are we handling these? When do they have to be created?
What I meant about the time stamps is that the project-info.xml will have a time-stamp, but if the project-info.xml contains <summary url="foo.html"> then foo.html needs to have a timestamp as well and that timestamp will be different than the project-info.xml timestamp. So each XML tag/attribute that refers to a separate file will create another timestamp to consider.
Right now we only have knowledge of the "projects" table inside the foundation DB. Is there another table inside this DB from where we can obtain the e-mail addresses of the project leaders?
Yes, but you'll have to get the schema and the PHP class to access the table, from Denis.
Created attachment 33919 [details] Last Version Only missing the E-Mail stuff.
Just added a new zip with all the files involved with the latest version we have for this project. We are only missing the e-mailing part for the project leaders (waiting for the tables to get that information). But you (Bjorn) may want to take a look at the structure of the files and post any comments. The following text is a sample of the e-mail that is going to be sent to the project leaders:
<!-----
Project leader,
This is an auto-reminder sent every quarter; you do not need to reply. We have detected that some information is not up to date, and we recommend the following actions:
- Please write a new executive summary.
- Please update your releases information.
- Please put your project-page-paragraphs.html file inside your project-info folder.
----->
So far we are using complementary code under the PHP License. Is this OK? It would be nice to know which types of licenses are available to use (and which are not).
We do not publish a list of licenses - all projects have to go through the legal clearance process for third-party code.
Created attachment 34830 [details] Latest Version 02/15/06 Latest version 02/15/06
The latest version is in the attachment section. Right now it is running at: http://phoenix.baueralonso.com/projects/auto_reminder.php There is a sample of the email that this tool will be sending. It is supposed to have the following new features:
1. It shouldn't send email if there are no problems.
2. The project_page_paragraph_file_rule.php header comments say that it is the Dates File Rule - so then I begin to wonder if there are other problems with the comments - please fix the comments and I'll look again.
3. The rule files need comments that describe what rule they are checking.
4. Instead of having a function inside each file and having the function name match, I want to have (a) a class in each file and (b) the code greps for "class XXX" to figure out which class to instantiate and (c) the class has a standard process( old, new ) method.
5. Why are you searching in "temporary/" rather than in the project's correct URL?
6. Doesn't the releases rule just complain every month? I don't think that's a good idea. The idea of the releases rule was to check the timeline and, if there was a scheduled release for Dec 2005 and it's now January 2006 and the release is still "scheduled" instead of "completed", to hint that maybe the releases information is out of date.
7. The executive summary rule should only complain if the summary has not been changed in the last quarter, not if it has not been changed since the last time this code was run.
8. I notice that timestamps are not used in any of the rules - why is that?
9. We cannot include any GPLed code in our CVS. GPL is not compatible with EPL. Thus the php_diff routine cannot be checked in.
So far we have the following documentation: http://phoenix.baueralonso.com/projects/common/rules/README_FIRST.html
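The intent of the releases rule in item 6 can be sketched as follows, in Python rather than the tool's PHP. The (name, due date, status) tuple shape and both function names are assumptions of this example, not the actual layout of the <releases> section.

```python
from datetime import date


def overdue_releases(releases, today):
    """Return names of releases still "scheduled" after their date has passed.

    `releases` is a list of (name, due_date, status) tuples, a hypothetical
    shape chosen for this sketch.
    """
    return [name for name, due, status in releases
            if status == "scheduled" and due < today]


def releases_reminder(releases, today):
    """Return a hint for the project lead, or None when nothing is overdue."""
    late = overdue_releases(releases, today)
    if not late:
        return None
    return ("these releases are past their date but still marked scheduled: "
            + ", ".join(late)
            + "; mark them completed or update the dates")
```

Because the check depends only on the timeline and today's date, it stays silent until a release actually slips past its date, instead of complaining on every run.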
Created attachment 35262 [details] Auto Reminder Tool Patch Run it inside "/www/projects/"
Marking as fixed and attaching the patch.
IMPORTANT NOTES:
1. There is a special directory "projects/cache/" that stores all the "old objects" to be compared against on the next run. This directory will contain a sub-directory for each project. Because the tool needs to read and write new data for each project, the whole "/cache/" directory and all its sub-directories MUST be writable by the "apache" user. If possible, set the permissions of the "cache/" and "cache/*" directories to '777', which means "everyone" (including apache) can write to those directories.
2. The auto-reminder tool was built to run as a cron job every quarter, so this has to be configured properly. We have included the "www/projects/common/run_autoremind.php" routine in the patch; this is the file that has to be executed by the cron job. To set that up, run:
shell> crontab -e
and inside the crontab editor add the following line:
0 0 1 */3 * /path/to/CVS/www/projects/common/run_autoremind.php
3. The tool can always be executed via the web by just entering the URL in the browser. However, every time the tool is executed the "old_project-info" objects are updated to the current information available, so all the old data is erased. That's why we recommend letting the cron job do its job.
4. Inside the file projects/common/rules/summary_paragraph_rule.class.php there is a subroutine called "diff_compute". This function uses the "diff" command, and right now it is configured to find that command at "/usr/bin/diff". If that is not the location of "diff" on your side, please change the path to where the "diff" binary is.
5. The documentation is available in "projects/common/rules/README_FIRST.html" and "projects/common/rules/descriptions.php".
6. Run the patch inside "/www/projects/".
This whole thing needs re-architecting. First, bug 129861 and bug 129862 need to be solved. Then this system will be built. There are a number of components:
1. A cron job that runs every night and runs a set of rules. After it finishes running all the rules, it runs a second process that sends out any necessary reminder messages.
2. Log files that recycle every month that record all the behavior of the system so that we can track down what it is doing.
3. A cache of older results used by the rules to look for changes.
4. Fetcher objects which retrieve various bits of data. For example, a ProjectInfo fetcher uses the REST web-api to fetch the latest ProjectInfo object. For example, a description-paragraph-fetcher uses the information in the ProjectInfo object to fetch the summary paragraph. The results of all of these are stored in the cache by timestamps so that differences can be determined.
5. Each rule is a separate php class that (a) runs all the fetches it needs (fetchers only run once each day and use cached results all other times); (b) checks the consistency of the results or compares current and previous results or whatever; (c) logs any messages that need to be sent.
6. Complete documentation of how this all works.
P.S. The whole timestamp system should be removed from ProjectInfo and its related classes.
The idea is that it should be easy to add new rules (new php classes) and new data sources (new fetchers). The code for rules should be simple – something like:
X <- xml of current ProjectInfo for $project;
Y <- xml of ProjectInfo of one month ago for $project;
if X = Y, then log an email “you need to update your project info”
– it should be that simple to write these rules.
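The fetcher-plus-rule split described above might be sketched like this. It is Python rather than the PHP the system calls for, it uses an in-memory dict where the real cache would be persistent, and the names `CachingFetcher` and `unchanged_for_a_month` are hypothetical, as is treating a month as 30 days.

```python
from datetime import date, timedelta


class CachingFetcher:
    """Fetch a value at most once per day and keep the dated results."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: project -> text (assumed shape)
        self._cache = {}     # (project, day) -> text

    def get(self, project, day):
        """Return the result for `day`, fetching only on the first request."""
        key = (project, day)
        if key not in self._cache:
            self._cache[key] = self._fetch(project)
        return self._cache[key]

    def get_as_of(self, project, day):
        """Return the newest cached result on or before `day`, or None."""
        days = [d for (p, d) in self._cache if p == project and d <= day]
        return self._cache[(project, max(days))] if days else None


def unchanged_for_a_month(fetcher, project, today, log):
    """Rule: X is the current info, Y the info of about one month ago."""
    x = fetcher.get(project, today)
    y = fetcher.get_as_of(project, today - timedelta(days=30))
    if y is not None and x == y:
        log.append((project, "you need to update your project info"))
```

The rule only appends to the log; sending the messages is left to the separate second process from item 1.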
Created attachment 37935 [details] Image for the documentation This image could not be added to a patch through Eclipse. It would generate a never ending blank file. It should go into $_SERVER['DOCUMENT_ROOT']/projects/common/doc/howstuffworks/
Fixed by patch sent in 129862
Image file checked-in
Patch from bug 140952 applied.
Closed.