We have seen a need to store task-related information across various components in the frameworks. Currently this information is stored in preferences, the task list, and files, and is often kept in memory even though it is only needed in certain scenarios. Some examples of task-related data:

* Private notes
* Activity
* Interaction history
* Editor mementos
* Version information (e.g. branches)

I propose that we create a new directory structure with one directory per task to aggregate all information for a task that is not stored in the task list or task data cache.
Here is a suggestion for the directory structure (from bug 355031):

pre. .mylyn/data/bugs.eclipse.org/1234/
  attachments/
    unsubmitted-patch.txt
  activity.xml
  changesets.xml
  context.xml
  notes.txt
  task-edits.xml
  version.psf
  workspace-state.xml

The cached data could be separated in another directory to make it easier to manually free disk space for instance.

pre. .mylyn/cache/bugs.eclipse.org/1234/
  attachments/
    screenshot.png
  task-data.xml
  task-lastread.xml
Additionally, there is per-repository information that could also be persisted in this store:

* Repository configuration
* Saved queries
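To make the proposed layout concrete, here is a minimal sketch of how the two directory trees could be resolved in code. The class `TaskStoreLayout` and its methods are hypothetical names chosen for illustration and are not part of Mylyn; the sketch also assumes that replacing unusual characters with underscores is enough to turn a repository name or task ID into a safe directory name.

```java
import java.nio.file.Path;

/** Hypothetical helper, not part of Mylyn: resolves the per-task directories proposed above. */
final class TaskStoreLayout {

    /** Root of the store, for example the ".mylyn" directory. */
    private final Path root;

    TaskStoreLayout(Path root) {
        this.root = root;
    }

    /** Durable per-task data: root/data/repository/taskId. */
    Path dataDirectory(String repository, String taskId) {
        return resolve("data", repository, taskId);
    }

    /** Disposable per-task cache: root/cache/repository/taskId. */
    Path cacheDirectory(String repository, String taskId) {
        return resolve("cache", repository, taskId);
    }

    private Path resolve(String area, String repository, String taskId) {
        return root.resolve(area).resolve(sanitize(repository)).resolve(sanitize(taskId));
    }

    /** Assumption: a simple whitelist is sufficient to build a safe path segment. */
    static String sanitize(String segment) {
        String cleaned = segment.replaceAll("[^A-Za-z0-9._-]", "_");
        if (cleaned.isEmpty() || cleaned.equals(".") || cleaned.equals("..")) {
            throw new IllegalArgumentException("Not usable as a directory name: " + segment);
        }
        return cleaned;
    }
}
```

Keeping the `data` and `cache` roots separate in one place like this is what would allow the cache tree to be deleted without touching notes, contexts, or unsubmitted attachments.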
So I thought I'd follow up on the discussion on the Mylyn call: "There ought to be a model for this". The discussion is actually far more general than this bug, but at the risk of going way off topic, I thought this would be a good place to peg it for now.

As I was wrapping my head around this particular task, I was asking myself "what kind of information do we want to keep related to a given task?" And it seemed that the answer is "a pretty complete cross-section of the now internal data structures that Mylyn uses". So any discussion of how to persist task information naturally leads to the question of what the overall approach to dealing with all of the user data within Mylyn is, and what that approach could ideally be. And, yep, much of it is begging for a model.

I'm still pretty unfamiliar with the Mylyn internals, but what I am seeing is a lot of different places where there is highly structured data that it would be nice to be able to get at in many different ways, but with well-defined rules to preserve integrity and so on.

Taking a pretty simple example, I looked at the context store. Here it would not be difficult at all to create an EMF model that could gracefully replace the existing, nicely abstracted one. Ideally, objects such as IInteractionContext would be modeled objects themselves, but we could also preserve the existing API. What are the benefits of that? Without getting into everything that an EMF model provides, I'll randomly throw some ideas out:

1. Easy distribution and introspection, so for example a user could identify all interactions of a certain type within a given period and just grab those elements for a new context.
2. Much more fine-grained ability to slice and dice contexts.
3. Notification, so for example a view could listen for just interactions of a particular type, rather than the current approach of having all bridge implementations manage all interactions that come their way.
4. Edit command support and validation, so that consumers can't accidentally put inconsistent information into the context store.
5. Transparency of contexts to users, so that they could actually see what they've been doing, analyse it, etc.
6. Transforming contexts between workspaces as appropriate.
7. Arbitrary persistence mechanisms, so that contexts could be stored in an XML file as is currently happening, in an enterprise DB for access from any workstation, or *in an efficient binary format* so that you don't have to deal with the performance and maintenance overhead of zip files.

And then adding other EMF technologies:

1. CDO: keep distributed, enterprise-wide contexts updated in real time across workspaces.
2. EMF Compare: users can see the deltas between different contexts to understand the overlap between various task types, say.

I'll stop there. Now imagine the same thing for tasks, activities, etc. So that's hopefully motivating the overall picture. Sorry, got a bit carried away.

But for the case of persisting task-related data, here's how that might work. Create a model of all of the potential parts, just as Steffen has done here, but without a particular persistence mechanism, i.e. file structure. Then we can map that to other resources in an extensible way. Some of those might be appropriate for a model (e.g. contexts, task meta-data, activity?) and some not (i.e. attachments, changesets and other artifacts that have external representations that we can't or don't want to change). All of these can be represented within the model. Then we can map that to a local store persistence mechanism that would place them in the directory location or wherever. Since at that point we're interested in the model and not the particular data store implementation, it will be much easier to change that or make it configurable, so again, if people want to keep task meta-data in a separate repository, they could.
Again, I'm not familiar enough with the current mechanisms and project history to know whether some of this is already happening, whether I've got pieces of this confused, or whether I'm just generally clueless about what's helpful functionality-wise. But to me there looks to be potential for a big functionality and maintainability win here, with a significant but I think still pretty constrained effort. I also think that it's possible to do at least some of this gradually, without disrupting the current API, or at least while providing a smooth transition path.
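The separation argued for above, where consumers talk to a model and the storage backend is swappable, can be illustrated without EMF. The following is only a plain-Java sketch of that idea, not an EMF model and not existing Mylyn API: `TaskArtifactStore`, `InMemoryArtifactStore`, `DirectoryArtifactStore` and `TaskNotes` are all hypothetical names, and the task key is assumed to be a trusted relative path such as `bugs.eclipse.org/1234`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical persistence abstraction: consumers never see where the bytes end up. */
interface TaskArtifactStore {
    void put(String taskKey, String artifact, byte[] content);

    Optional<byte[]> get(String taskKey, String artifact);
}

/** Backend that keeps everything in memory, e.g. for tests. */
final class InMemoryArtifactStore implements TaskArtifactStore {
    private final Map<String, byte[]> entries = new HashMap<>();

    @Override
    public void put(String taskKey, String artifact, byte[] content) {
        entries.put(taskKey + "/" + artifact, content.clone());
    }

    @Override
    public Optional<byte[]> get(String taskKey, String artifact) {
        byte[] value = entries.get(taskKey + "/" + artifact);
        return value == null ? Optional.empty() : Optional.of(value.clone());
    }
}

/** Backend that maps each task to a directory below a root (no path validation in this sketch). */
final class DirectoryArtifactStore implements TaskArtifactStore {
    private final Path root;

    DirectoryArtifactStore(Path root) {
        this.root = root;
    }

    @Override
    public void put(String taskKey, String artifact, byte[] content) {
        Path file = root.resolve(taskKey).resolve(artifact);
        try {
            Files.createDirectories(file.getParent());
            Files.write(file, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public Optional<byte[]> get(String taskKey, String artifact) {
        Path file = root.resolve(taskKey).resolve(artifact);
        if (!Files.isRegularFile(file)) {
            return Optional.empty();
        }
        try {
            return Optional.of(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** A consumer written against the abstraction only; it works with either backend. */
final class TaskNotes {
    private final TaskArtifactStore store;

    TaskNotes(TaskArtifactStore store) {
        this.store = store;
    }

    void save(String taskKey, String text) {
        store.put(taskKey, "notes.txt", text.getBytes(StandardCharsets.UTF_8));
    }

    Optional<String> load(String taskKey) {
        return store.get(taskKey, "notes.txt").map(bytes -> new String(bytes, StandardCharsets.UTF_8));
    }
}
```

A database-backed or remote store would be one more implementation of the same interface, which is the configurability described above. What this sketch does not give you is what an EMF model would add on top: notification, validation and introspection.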
Thanks Miles. That's very valuable input. It'd be great to formalize the models that compose the task context, but it's beyond the scope of this bug and we won't be able to take that on in the short term. As we improve the context persistence story we should keep your suggestions in mind, and we can start working towards defining those models.

As discussed on the call, the first step here is to provide a simple API that supports storing content on disk on a per-task basis. It could make sense to simply create task directories under .metadata/.mylyn/tasks/bugzilla-http.../data/. We could either extend TaskDataManager or create a separate store to handle that.
(In reply to comment #4)
> Thanks Miles. That's very valuable input. It'd be great to formalize the models
> that compose the task context but it's beyond the scope of this bug and we
> won't be able to take that on short term. As we improve the context persistence
> story we should keep your suggestions in mind and we can start working on
> towards defining those models.

Sounds good. I've created bug 368203 to track that possibility.
Manuel, I might take a first pass at this at the end of this week. Have you already started working on the code?
Nothing relevant. I just looked around and tried to understand the code. Please go ahead. Since I'm on vacation next week, I think I can join your work then.
I have pushed a review here: http://review.mylyn.org/308. The proposed API in AbstractTaskContextStore is commented out, but it should give you an idea of where I'm going with this. Files are created under the tasks directory, e.g. for the local task with the ID 2: .mylyn/tasks/local-local/data/2/. The data directory is on the same level as the offline directory that stores cached task data.
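For readers without access to the review, the directory scheme described in this comment can be sketched as follows. This is not the API from the review or from AbstractTaskContextStore: the class `TaskFileStore`, its method names and the `create` flag are hypothetical, and only the path layout (`tasks/<repository directory>/data/<task ID>/`) is taken from the comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a per-task file store; names are illustrative, not Mylyn API. */
final class TaskFileStore {

    /** The tasks directory, e.g. ".mylyn/tasks". */
    private final Path tasksRoot;

    TaskFileStore(Path tasksRoot) {
        this.tasksRoot = tasksRoot;
    }

    /** tasksRoot/repositoryDir/data/taskId, a sibling of the "offline" directory. */
    Path taskDirectory(String repositoryDir, String taskId) {
        return tasksRoot.resolve(repositoryDir).resolve("data").resolve(taskId);
    }

    /**
     * Returns the location of a named file for a task. The task directory is only
     * created when {@code create} is true, so read-only lookups leave no empty directories.
     */
    Path fileForTask(String repositoryDir, String taskId, String name, boolean create) throws IOException {
        Path directory = taskDirectory(repositoryDir, taskId);
        if (create) {
            Files.createDirectories(directory);
        }
        return directory.resolve(name);
    }
}
```

With a store rooted at `.mylyn/tasks`, `taskDirectory("local-local", "2")` resolves to `.mylyn/tasks/local-local/data/2`, matching the example in the comment.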
I'll take this for now since we are past the contribution deadline for 3.7. We can always open new bugs and defer any changes or enhancements to 3.8 if you want to work on this further, Manuel.
It would be nice for the framework to support storing information on a task which is automatically persisted only when the task is saved, similar to the key/value pairs you can store on ITask.
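One way to read this request is as a small write-behind buffer: edits are held in memory and only reach the persisted state when the task is saved. The sketch below illustrates that reading. `DeferredTaskAttributes` is a hypothetical class, not Mylyn API, and a plain map stands in for whatever the real persisted store would be.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: key/value task attributes that are persisted only on save. */
final class DeferredTaskAttributes {

    /** Stands in for the persisted state (assumption: a real store would write to disk). */
    private final Map<String, String> saved = new HashMap<>();

    /** Edits made since the last save; a null value marks a pending removal. */
    private final Map<String, String> pending = new HashMap<>();

    /** Records an edit without persisting it; pass null to remove the key on save. */
    void set(String key, String value) {
        pending.put(key, value);
    }

    /** Reads through pending edits first, then falls back to the saved state. */
    String get(String key) {
        return pending.containsKey(key) ? pending.get(key) : saved.get(key);
    }

    boolean isDirty() {
        return !pending.isEmpty();
    }

    /** To be called when the owning task is saved: applies all pending edits. */
    void save() {
        for (Map.Entry<String, String> edit : pending.entrySet()) {
            if (edit.getValue() == null) {
                saved.remove(edit.getKey());
            } else {
                saved.put(edit.getKey(), edit.getValue());
            }
        }
        pending.clear();
    }

    /** Discards edits made since the last save. */
    void revert() {
        pending.clear();
    }

    /** Copy of the persisted state, for inspection. */
    Map<String, String> savedSnapshot() {
        return new HashMap<>(saved);
    }
}
```

The design choice here is that reads see unsaved edits immediately, while the persisted state only changes in `save()`, so abandoning an edit session is a matter of calling `revert()`.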
Mylyn has been restructured, and our issue tracking has moved to GitHub [1]. We are closing ~14K Bugzilla issues to give the new team a fresh start. If you feel that this issue is still relevant, please create a new one on GitHub. [1] https://github.com/orgs/eclipse-mylyn