You can subscribe to this list here.
| 2002 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
(28) |
Nov
(87) |
Dec
(16) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2003 |
Jan
(109) |
Feb
(107) |
Mar
(117) |
Apr
(5) |
May
(156) |
Jun
(83) |
Jul
(86) |
Aug
(25) |
Sep
(17) |
Oct
(14) |
Nov
(82) |
Dec
(50) |
| 2004 |
Jan
(14) |
Feb
(75) |
Mar
(110) |
Apr
(83) |
May
(20) |
Jun
(36) |
Jul
(12) |
Aug
(37) |
Sep
(9) |
Oct
(11) |
Nov
(52) |
Dec
(68) |
| 2005 |
Jan
(46) |
Feb
(94) |
Mar
(68) |
Apr
(55) |
May
(67) |
Jun
(65) |
Jul
(67) |
Aug
(96) |
Sep
(79) |
Oct
(46) |
Nov
(24) |
Dec
(64) |
| 2006 |
Jan
(39) |
Feb
(31) |
Mar
(48) |
Apr
(58) |
May
(31) |
Jun
(57) |
Jul
(29) |
Aug
(40) |
Sep
(22) |
Oct
(31) |
Nov
(44) |
Dec
(51) |
| 2007 |
Jan
(103) |
Feb
(172) |
Mar
(59) |
Apr
(41) |
May
(33) |
Jun
(50) |
Jul
(60) |
Aug
(51) |
Sep
(21) |
Oct
(40) |
Nov
(89) |
Dec
(39) |
| 2008 |
Jan
(28) |
Feb
(20) |
Mar
(19) |
Apr
(29) |
May
(29) |
Jun
(24) |
Jul
(32) |
Aug
(16) |
Sep
(35) |
Oct
(23) |
Nov
(17) |
Dec
(19) |
| 2009 |
Jan
(4) |
Feb
(23) |
Mar
(16) |
Apr
(16) |
May
(38) |
Jun
(54) |
Jul
(18) |
Aug
(40) |
Sep
(58) |
Oct
(6) |
Nov
(8) |
Dec
(29) |
| 2010 |
Jan
(40) |
Feb
(40) |
Mar
(63) |
Apr
(95) |
May
(136) |
Jun
(58) |
Jul
(91) |
Aug
(55) |
Sep
(77) |
Oct
(52) |
Nov
(85) |
Dec
(37) |
| 2011 |
Jan
(22) |
Feb
(46) |
Mar
(73) |
Apr
(138) |
May
(75) |
Jun
(35) |
Jul
(41) |
Aug
(13) |
Sep
(13) |
Oct
(11) |
Nov
(21) |
Dec
(5) |
| 2012 |
Jan
(13) |
Feb
(34) |
Mar
(59) |
Apr
(4) |
May
(13) |
Jun
(1) |
Jul
(1) |
Aug
(1) |
Sep
(3) |
Oct
(2) |
Nov
(4) |
Dec
(1) |
| 2013 |
Jan
(18) |
Feb
(28) |
Mar
(19) |
Apr
(42) |
May
(43) |
Jun
(41) |
Jul
(41) |
Aug
(31) |
Sep
(6) |
Oct
(2) |
Nov
(2) |
Dec
(70) |
| 2014 |
Jan
(55) |
Feb
(98) |
Mar
(44) |
Apr
(40) |
May
(15) |
Jun
(18) |
Jul
(20) |
Aug
(1) |
Sep
(13) |
Oct
(3) |
Nov
(37) |
Dec
(85) |
| 2015 |
Jan
(16) |
Feb
(12) |
Mar
(16) |
Apr
(13) |
May
(16) |
Jun
(3) |
Jul
(23) |
Aug
|
Sep
|
Oct
|
Nov
(9) |
Dec
(2) |
| 2016 |
Jan
(12) |
Feb
(1) |
Mar
(9) |
Apr
(13) |
May
(4) |
Jun
(5) |
Jul
|
Aug
|
Sep
(10) |
Oct
(11) |
Nov
(1) |
Dec
|
| 2017 |
Jan
|
Feb
(1) |
Mar
(11) |
Apr
(8) |
May
|
Jun
(6) |
Jul
|
Aug
|
Sep
|
Oct
(3) |
Nov
(2) |
Dec
(1) |
| 2018 |
Jan
(6) |
Feb
(6) |
Mar
(3) |
Apr
(9) |
May
(3) |
Jun
|
Jul
|
Aug
(3) |
Sep
(8) |
Oct
(1) |
Nov
(1) |
Dec
(4) |
| 2019 |
Jan
(4) |
Feb
|
Mar
(1) |
Apr
|
May
(2) |
Jun
|
Jul
|
Aug
|
Sep
|
Oct
(2) |
Nov
(1) |
Dec
|
| 2020 |
Jan
(22) |
Feb
(4) |
Mar
|
Apr
|
May
|
Jun
(1) |
Jul
(2) |
Aug
(2) |
Sep
(1) |
Oct
|
Nov
|
Dec
(1) |
| 2021 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(1) |
Jun
|
Jul
(2) |
Aug
(2) |
Sep
|
Oct
|
Nov
|
Dec
|
| 2022 |
Jan
(1) |
Feb
|
Mar
(1) |
Apr
|
May
|
Jun
|
Jul
|
Aug
(2) |
Sep
|
Oct
|
Nov
|
Dec
|
| 2023 |
Jan
|
Feb
|
Mar
(1) |
Apr
(1) |
May
(5) |
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
| 2024 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(3) |
Aug
(3) |
Sep
|
Oct
|
Nov
|
Dec
|
| 2025 |
Jan
|
Feb
|
Mar
|
Apr
(1) |
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
| S | M | T | W | T | F | S |
|---|---|---|---|---|---|---|
|
|
|
1
(7) |
2
(25) |
3
(8) |
4
(7) |
5
|
|
6
|
7
(3) |
8
(5) |
9
(2) |
10
(2) |
11
(2) |
12
(1) |
|
13
|
14
|
15
(10) |
16
|
17
|
18
|
19
|
|
20
|
21
|
22
(10) |
23
(7) |
24
(1) |
25
(2) |
26
|
|
27
|
28
(2) |
|
|
|
|
|
|
From: Chris M. <cj...@fr...> - 2005-02-28 23:07:11
|
On Thu, 24 Feb 2005, Scott Cain wrote: > Hello, > > At the moment, when someone does `perl Makefile.PL` to install chado, > the user is presented with a list of 4 functions that he can optionally > install (listed in modules/function.modules): > > sequence/gff-bridge/sequence-gff-funcs.pgsql > sequence/functions/feature_subalignments.plpgsql > sequence/functions/get_sub_feature_ids.plpgsql > sequence/apollo-bridge/apollo-triggers.sql > > I suspect that there are other functions (written mostly by folks at > Harvard_ that I don't know anything about, but would probably be > generally useful. Am I right? What other functions in schema/chado > should be optionally installed? There's also the stuff in the cv module that Shu wrote for recursively traversing the cvterm_relationship table and populating cvtermpath I think in general it is ok to have functions installed by default, but we should be wary of installing triggers by default (unless these are triggers on bridge tables) here is how I see functions working in chado Every module should have a DBMS API public interface specification, in SQL or pseudo-SQL. The chado user (ie coder) can expect to find all these functions implemented - they shouldn't care how they are implemented. The chado installer should have a default set of implementations they can install without thinking too hard. The advanced chado installer may decide to have their own implementations. Reasons for a non-default installation may be: different DBMS; different policy (eg feature naming functions) I am attaching an example of what the interface specification for the sequence module may look like - if people are happy with this, Shu and I can use the same syntax to document the functions that currently exist which should be part of the core API > A related question is what should be in the default database schema. At > the moment, the first three in the list above are included in the > default schema, if there are more added to the optional list, which > among those should be put in the default? > > Thanks, > Scott > > |
|
From: Aubrey de G. F. a. <ag...@ge...> - 2005-02-28 12:34:57
|
I should probably clarify, for the record, that here: > I've chatted with Aubrey about this, and while he has some fairly > compelling arguments as to why the cv and dbxref modules could be merged > into one, he agrees with me that at this point in the game, the cost of > making such a fundamental change in the schema would be much greater > than any benefit we might realize. I only agree about this on the basis of what Dave reports about software already written by others. I still have grave concerns that leaving the cvterm and dbxref tables separate will be a long-term aggravation that we'll ultimately regret. But for now, as Dave says, adding the two new tables does what we need. Cheers A |
|
From: David E. <mi...@tm...> - 2005-02-25 18:57:03
|
Hi All, Having resolved the "Tables with text fields in unique key" thread, we need to push through this cvterm/dbxref thread. I'll try to be brief, as Chris is probably holding his breath. I've chatted with Aubrey about this, and while he has some fairly compelling arguments as to why the cv and dbxref modules could be merged into one, he agrees with me that at this point in the game, the cost of making such a fundamental change in the schema would be much greater than any benefit we might realize. The cvterm/dbxref thread got started because FlyBase needs to be able to do these two things with dbxrefs which are not presently supported: 1) link properties (eg, comments) to the relationship between a feature (eg, "gene"), and a dbxref (eg, GB-acc#). 2) maintain relationships between dbxrefs, for example, the relationship between a GenBank nucleotide record accession number and the protein_ids for proteins coded in that GenBank record. I propose that we add a feature_dbxrefprop table to the sequence module to accomodate the former, and a dbxref_relationship table to the dbxref module to accomodate the latter. Chris suggested that we should consider using feature & feature_relationship for managing the relations between these acc#. Under some circumstances, this might be a proper approach, but for what FlyBase needs to do, I think it would be extremely heavy. All we care about are the acc#; in order to implement these as features, we'd have to do quite a bit more work than just know the acc#. Also, we'd be in a position of having dbxref relationships happening via two routes, feature_dbxref and feature_relationship, which I'm not too keen on. This is getting long, and Chris is probably turning blue, so I'll stop here. Unless there are objections, I propose to go forward adding these two tables to sourceforge early next week. Breathe! -Dave On 02/07/05 at 07:22 Aubrey de Grey FlyBase account wrote: > > Hi Chris, > > > Is interpro really central to this? > > Yes, of course it is! It's the only way that I've succeeded in > showing > you that the distinction between what is an ontology and what isn't is > "perhaps fuzzier than I originally made out it was". Now all I > need to > do is go a teeny bit further and show that that distinction is in fact > a carbuncle on the face of chado -- a valueless, arbitrary matter of > pure taste that has no business being enshrined in the structure of a > database with the ambitions of chado. More to the point I need to > demonstrate that enshrining it (by having cvterm and dbxref as > separate > tables) has a bunch of big disadvantages in actual practice that > we can > already see as we implement Cambridge data in chado, and no remotely > comparable advantages. > > > For chado normalization and accurate modeling comes first. In > general > > you'll have to do a lot more joins to get the same data from > chado than > > most other dbs. This doesn't mean chado isn't all-purpose, but > it does > > sacrifice speed for normalization. > > Makes fine sense (except that normalisation can perhaps be taken > too far, > as we're seeing in the other thread). But then if we can get > fewer joins > WITHOUT sacrificing normalisation, that's still a big plus, isn't it? > And this proposal has no effect on normalisation whatever (see below). > > > > Please can you also elaborate on the disadvantages you > mention above. > > > What are they? > > > > non-uniformity (all other dbxref/IDs are stored in a dbxref > table, apart > > from GO/OBO dbxrefs/IDs) > > I have no objection to the merged table being called dbxref > rather than > cvterm if that makes people happier, but maybe that's not what > you mean. > In the proposal I'm making, all dbxref/IDs are stored in one > table -- so > please elaborate. > > Also, hang on, GO dbxref/IDs **are** stored in dbxref in the > chadoXML I'm > getting from Pinglei -- what do you mean here? > > > denormalisation (the GO/OBO dbxref/ID is collapsed into a > single string, > > making it necessary to do string match queries to ask > questions like "how > > many terms in GO vs SO" (though in practice the same result can be > > obtained by using the equivalent of cv_id, although as noted > before this > > isn't always 1:1). > > um, you seem to be answering yourself here and pointing out that > we don't > have to do string matches. > > > it also means any kind of automated ID granting has to > > be mediated via code, which isn't a problem since this is all > handled > > ex-database for GO anyway) > > So you're saying this is not a disadvantage elther? So what > **are** these > alleged disadvantages (apart from "non-uniformity", which I need > you to > explain more thoroughly)? > > > > > First of all, your example sql query will return X01803 > because you said > > > > that this has the name ATPG418 > > > > > > No, not unless we populate FlyBase with GenBank LOCUS > fields, which we can > > > choose whether or not to do and have historically not done. > > > > But a change in policy would break existing queries that are purely > > cv-oriented and have nothing to do with genbank locus fields > per se. Sign > > of a problem > > Not at all! -- changes in policy always break things that rely > on the old > policy. If a query wants to exclude GenBank it can exclude it > by saying > so explicitly, rather than by saying "only use rows that have a > non-null > name". GenBank is no different from Interpro here -- we might > start off > not populating cvterm for Interpro and then change the policy, > and results > of "cv-oriented" queries would change. > > > > No - cv_id and accession would both be non-nullable. name > would be > > > nullable but that's fine, we don't need it in the UC. > > > > on the contrary, this is a significantly loss if we can't > guarantee that > > name and cv_id can uniquely identify a (non-obsolete) term > > > > I think this is enough to kill the proposal > > Hang on -- you were saying a few rounds ago that names should > never be used > for such things, only IDs should: > > > You should only supply the dbxref, and the dbxref only. If you > want to > > provide cvterm/name, it could be problemmatic. What if you're > supplying > > GO annotations and your version of GO is lagging behind the > actual GO? > > An out-of-date name from you will clobber the up-to-date name in the > > database. > > and I agreed with this (and still do), even though it means CVs > that are > not currently in OBO being put there. Please give us a concrete > example > where one would want to use name+cv_id as a unique constraint. > > > I'm still in the dark as to what you're proposing > > is it > > table cvterm > > cvterm_id PK > > name ??NOT NULL > > cv_id NOT NULL > > db_id NOT NULL > > accession NOT NULL > > > > ??UNIQUE (name, cv_id) > > UNIQUE (db_id, accession) > > No, it's this: > > table cvterm > cvterm_id PK > name > cv_id NOT NULL > accession NOT NULL > > UNIQUE (cv_id, accession) > > No db_id here, because that's in the cv table. So not this: > > > db(1,GO) > > db(2,GenBank) > > cv(1,cellular component) > > cv(2,nucleotide) > > cvterm(1,synaptic blah,1,1,0008021) > > cvterm(2,aptg418,2,2,X01803) > > but rather: > > db(1,GO) > db(2,GenBank) > cv(1,cellular component,db_id=1) > cv(2,nucleotide,db_id=2) > cvterm(1,synaptic blah,1,0008021) > cvterm(2,<null>,2,X01803) <--- "aptg418" can be added later if > we choose > > > OK, a lot of the time with interpro, you just know the dbxref, > and maybe > > that's all you want to populate. You have the option of > loading the whole > > interpro ontology at some later time, as cvterms, and then > filling in all > > your extra info. > > OK, so just to be totally explicit, your answer to my question: > > > Where in the current chado schema would you put Interpro names? > > is "in cvterm.name", right? > > > This is analagous to features - often you'll only know the > dbxref, and > > that's all you'll populate. Then you may want to load the equivalent > > genbank records, giving you fully populated feature rows. > > > > I've tried to explain that the main difference is in the > desired meaning > > of the table; this part can't be explained in SQL. The cvterm > table is > > more modeling concepts. The dbxref table is for modeling > addresses. A > > cvterm has a dbxref. Other things can have dbxrefs too. A > feature can have > > a dbxref. Where the dbxref_id is the UC for a table, the > dbxref (and thus > > its stringified UC) can stand in for a particular piece of > data, whether > > that piece of data is a feature or a concept. > > > > GenBank dbxrefs stand for features, GO dbxrefs stand for > concepts/cvterms. > > And Interpro dbxrefs stand for, hm, something in between. This > is the core > of the problem. It was only in this thread that you first > suggested the > idea of putting GenBank records in feature, and you immediately > pointed out > yourself that that is actually rather fraught: > > > Note that this will make things trickier for the > apollo/annotation side > > of chado, because these links will a little like alternate > spliceforms. > > and it's actually worse than that, because (eg) a single GenBank > nucleotide > record can represent more than one sequence, via the "variation" > feature. > > Consider this generally (since chado is supposed to be > all-purpose). What > the current schema requires is that any controlled data type > which has ANY > of the following features: > > - an inheritance hierarchy (relationships between its own items) > - relationships to items of other controlled data types > - names for its items that one would like to store in addition > to the IDs > - links to features that need attribution > - properties of its items or of the links of its items to features > > has to be instantiated either as a CV (using cv and cvterm) or > as a type > of feature (using the feature table) as well as being a database with > db and dbxref. This is necessary because cvterm and feature have > *_relationship, feature_cvterm has a pub_id column, and feature and > feature_cvterm both have *prop, whereas dbxref has none of those > things. > But, feature can only store mappable objects. So, a controlled data > type consisting of things that are not mappable has to be > treated as an > ontology unless we want to ordain that it shall never have ANY of the > five features listed above. Whereas, as far as I can tell (though you > still haven't really said), you think that something should not be in > cvterm unless it either has an inheritance hierarchy or is stored in > OBO. Conversely, even GenBank *has* an inheritance hierarchy -- the > CON records subsume smaller ones -- so it qualifies to go into cvterm > even by your definition, doesn't it? > > -------------------- > > So what we're left with at this point is that the proposal to > merge the > cvterm and dbxref tables has the following substantial advantages: > > - fewer joins > - simpler XML, simpler code everywhere > - no need to force GenBank into feature > - no need to decide in advance whether something is going to be > treated > as an ontology or not > > and no disadvantages at all except that all CVs have to be in a > database > somewhere, default being OBO, which we agree is actually a > constraint we > should be enforcing anyway so that queries/updates are done on IDs and > not on names. > > If you've got other disadvantages to raise, bring them on; eg I > hope now > you can give examples of this: > > > it makes all kinds of basic ontology queries harder and weirder. > > Cheers A |
|
From: Scott C. <ca...@cs...> - 2005-02-25 04:12:38
|
Hello, At the moment, when someone does `perl Makefile.PL` to install chado, the user is presented with a list of 4 functions that he can optionally install (listed in modules/function.modules): sequence/gff-bridge/sequence-gff-funcs.pgsql sequence/functions/feature_subalignments.plpgsql sequence/functions/get_sub_feature_ids.plpgsql sequence/apollo-bridge/apollo-triggers.sql I suspect that there are other functions (written mostly by folks at Harvard_ that I don't know anything about, but would probably be generally useful. Am I right? What other functions in schema/chado should be optionally installed? A related question is what should be in the default database schema. At the moment, the first three in the list above are included in the default schema, if there are more added to the optional list, which among those should be put in the default? Thanks, Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. ca...@cs... GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory |
|
From: Linda S. <Lin...@cg...> - 2005-02-24 10:29:14
|
Hello, On Feb 23, 2005, at 6:06 PM, Chris Mungall wrote: > On Wed, 23 Feb 2005, Pinglei Zhou wrote: > >> >>> Hi Pinglei, >>> >>> This schema looks good (it solves the unique observable_id problem,=20= >>> and >>> much more). But, I still have two questions (maybe it is one >>> question): >>> >>> Where do you put genetic interactions (suppresion, enhancement,=20 >>> etc.)? >> >> it should involve phenstatement/phenotype_comparison tables. >> >>> What is the purpose of the phenotype_comparison table? >> >> phenotype_comparison will replace the old >> interaction/interactionobject/interaction/subject table > > I don't think this is an adequate explanation! The > interaction/interactionobject/interaction/subject tables were removed=20= > from > the genetic module at the end of 2003. > > I was under the impression that the version of the genetic module that=20= > was > in cvs was the latest one, and relatively stable. So did we, which is why we started working with it. The fact that the=20= xml dtd was for a different schema was the tip off that something had=20 changed, but we didn't know when (i.e. was the dtd older or newer than=20= the current chado genetic schema). > This is the version > based around 'gcontext's, and reasonably well commented. Yes! The comments are extremely useful. > It seems that > this has been abandoned and there is now a new version. It would=20 > therefore > seem like a good idea to commit this new one to cvs and to add some > comments to it We would really appreciate that. Let me make a few comments and raise a few questions based on our=20 comparison (since yesterday afternoon) of the old(gcontext) version and=20= the current flybase version that Pinglei kindly sent us. We would like=20= to understand the design before deciding to modify it! And we would=20 prefer not to modify at all (other than by adding a stock sub-module=20 which we are happy to share) if we can handle our current (small=20 compared to drosophila) stock collection and our rnai data. gcontext table has become genotype table. OK phenstatement table has been replaced by phenotype and phenstatement=20 tables, this is much better (for us) because there is better=20 normalisation. Environment was previously in gcontextprop, at least according to the=20 comments in the sql. There are now two tables, environment and=20 env_cvterm, and there is no longer a gcontextprop table. We have a=20 problem with this for the following reason.: EITHER there is one=20 environment property (for example temperature_degrees_c) in which case=20= there is no need for env_cvterm table, it would be sufficient to use=20 the environment table, add type_id for the cvterm and use description=20 for the value, OR an environment involves more than one property (for=20 example temperature_degree_c and humidity_percent) but in that case,=20 there should be a value (text) in the env_cvterm table! Because it is=20= not logical to put two values in the description field in the=20 environment table. In this case, it would seem more logical for=20 env_cvterm to be called environmentprop (with key, value pairs) : table environmentprop environmentprop_id [U,PK] environment_id [U] type_id [U] value text Two tables now contain type_id, phenstatement and phentoype_comparison. We are guessing that the type_id in phenstatement could be used for=20 'dominant, semi-dominant, recessive etc.'. Is this the intention of=20 the design? If so (and even though this is **not** a problem for our=20 Paramecium data) how would you include something like=20 penetrance_pct=3D80? Again, where would the value go? Or are we missing=20= the point on the type_id? We are guessing that the type_id in phenotype_comparison is for genetic=20= interactions (suppression, complementation, enhancement etc.). The=20 problem we have here, is that we do not understand why there is=20 environment1 and environment2, since the genetic interaction (between=20 genotype1,phenotype1 and genotype2,phenotype2) is being evaluated in a=20= given environment. Linda > >> Pinglei >> >> >>> >>> On Feb 23, 2005, at 4:45 PM, Pinglei Zhou wrote: >>> >>>> Hi Linda, >>>> >>>> Not sure if I mentioned before, genetic module is still under >>>> development, we (FlyBase) are testing on a new >>>> design which is much different from the one in cvs. Here test I=20 >>>> means >>>> we try to feed Drosophila genetic to see >>>> if it will break the schema. welcome to give any suggestions. >>>> >>>> >>>> Pinglei >>>> >>>> >>>> >>>> >>>> >>>> genetic module currently tested by FlyBase >>>> >>>> create table genotype ( >>>> genotype_id serial not null, >>>> primary key (genotype_id), >>>> uniquename text not null, >>>> description varchar(255), >>>> unique(uniquename) >>>> ); >>>> create index genotype_idx1 on genotype(uniquename); >>>> GRANT ALL on genotype_genotype_id_seq to PUBLIC; >>>> GRANT ALL on genotype to PUBLIC; >>>> >>>> >>>> >>>> create table feature_genotype ( >>>> feature_genotype_id serial not null, >>>> primary key (feature_genotype_id), >>>> feature_id int not null, >>>> foreign key (feature_id) references feature (feature_id) on = delete >>>> cascade, >>>> genotype_id int not null, >>>> foreign key (genotype_id) references genotype (genotype_id) on=20= >>>> delete >>>> cascade, >>>> chromosome_id int, >>>> foreign key (chromosome_id) references feature (feature_id) on=20= >>>> delete >>>> set null, >>>> rank int not null, >>>> cgroup int not null, >>>> cvterm_id int not null, >>>> foreign key (cvterm_id) references cvterm (cvterm_id) on delete >>>> cascade, >>>> unique(feature_id, genotype_id, cvterm_id) >>>> ); >>>> create index feature_genotype_idx1 on feature_genotype = (feature_id); >>>> GRANT ALL on feature_genotype to PUBLIC; >>>> >>>> create index feature_genotype_idx2 on feature_genotype=20 >>>> (genotype_id); >>>> GRANT ALL on feature_genotype to PUBLIC; >>>> >>>> create table environment ( >>>> environment_id serial not NULL, >>>> primary key (environment_id), >>>> uniquename text not null, >>>> description text, >>>> unique(uniquename) >>>> ); >>>> create index environment_idx1 on environment(uniquename); >>>> >>>> >>>> create table env_cvterm ( >>>> env_cvterm_id serial not null, >>>> primary key (env_cvterm_id), >>>> environment_id int not null, >>>> foreign key (environment_id) references environment=20 >>>> (environment_id) >>>> on delete cascade, >>>> cvterm_id int not null, >>>> foreign key (cvterm_id) references cvterm (cvterm_id) on delete >>>> cascade, >>>> >>>> unique(environment_id, cvterm_id) >>>> ); >>>> create index env_cvterm_idx1 on env_cvterm (environment_id); >>>> create index env_cvterm_idx2 on env_cvterm (cvterm_id); >>>> >>>> >>>> create table phenotype ( >>>> phenotype_id serial not null, >>>> primary key (phenotype_id), >>>> uniquename text not null, >>>> observable_id int, >>>> foreign key (observable_id) references cvterm (cvterm_id) on = delete >>>> cascade, >>>> attr_id int, >>>> foreign key (attr_id) references cvterm (cvterm_id) on delete = set >>>> null, >>>> value text, >>>> cvalue_id int, >>>> foreign key (cvalue_id) references cvterm (cvterm_id) on delete = set >>>> null, >>>> assay_id int, >>>> foreign key (assay_id) references cvterm (cvterm_id) on delete = set >>>> null, >>>> unique( uniquename ) >>>> ); >>>> create index phenotype_idx1 on phenotype (cvalue_id); >>>> create index phenotype_idx2 on phenotype (observable_id); >>>> create index phenotype_idx3 on phenotype (attr_id); >>>> >>>> GRANT ALL on phenotype_phenotype_id_seq to PUBLIC; >>>> GRANT ALL on phenotype to PUBLIC; >>>> >>>> create index phenotype_idx1 on phenotype (type_id); >>>> GRANT ALL on phenotype to PUBLIC; >>>> >>>> create index phenotype_idx2 on phenotype (pub_id); >>>> GRANT ALL on phenotype to PUBLIC; >>>> >>>> create index phenotype_idx3 on phenotype (background_genotype_id); >>>> GRANT ALL on phenotype to PUBLIC; >>>> >>>> create table phenotype_cvterm ( >>>> phenotype_cvterm_id serial not null, >>>> primary key (phenotype_cvterm_id), >>>> phenotype_id int not null, >>>> foreign key (phenotype_id) references phenotype (phenotype_id) = on >>>> delete cascade, >>>> cvterm_id int not null, >>>> foreign key (cvterm_id) references cvterm (cvterm_id) on delete >>>> cascade, >>>> unique(phenotype_id, cvterm_id) >>>> ); >>>> >>>> GRANT ALL on phenotype_cvt_phenotype_cvt_seq to PUBLIC; >>>> GRANT ALL on phenotype_cvterm to PUBLIC; >>>> >>>> create index phenotype_cvterm_idx1 on phenotype_cvterm=20 >>>> (phenotype_id); >>>> GRANT ALL on phenotype_cvterm to PUBLIC; >>>> >>>> create index phenotype_cvterm_idx2 on phenotype_cvterm (cvterm_id); >>>> GRANT ALL on phenotype_cvterm to PUBLIC; >>>> >>>> >>>> >>>> create table phenstatement ( >>>> phenstatement_id serial not null, >>>> primary key (phenstatement_id), >>>> genotype_id int not null, >>>> foreign key (genotype_id) references genotype (genotype_id) on=20= >>>> delete >>>> cascade, >>>> environment_id int, >>>> foreign key (environment_id) references environment=20 >>>> (environment_id) >>>> on delete cascade, >>>> phenotype_id int not null, >>>> foreign key (phenotype_id) references phenotype (phenotype_id) = on >>>> delete cascade, >>>> type_id int, >>>> foreign key (type_id) references cvterm (cvterm_id) on delete=20 >>>> cascade, >>>> pub_id int not null, >>>> foreign key (pub_id) references pub (pub_id) on delete cascade, >>>> unique(genotype_id, environment_id, phenotype_id, type_id, = pub_id) >>>> ); >>>> create index phenstatement_idx1 on phenstatement (genotype_env_id); >>>> create index phenstatement_idx2 on phenstatement (phenotype_id); >>>> >>>> >>>> >>>> create table feature_phenotype ( >>>> feature_phenotype_id serial not null, >>>> primary key (feature_phenotype_id), >>>> feature_id int not null, >>>> foreign key (feature_id) references feature (feature_id) on >>>> delete cascade, >>>> phenotype_id int not null, >>>> foreign key (phenotype_id) references phenotype=20 >>>> (phenotype_id) >>>> on delete cascade, >>>> unique(feature_id,phenotype_id) >>>> ); >>>> GRANT ALL on feature_pheno_feature_pheno_seq to PUBLIC; >>>> GRANT ALL on feature_phenotype to PUBLIC; >>>> >>>> create index feature_phenotype_idx1 on feature_phenotype=20 >>>> (feature_id); >>>> GRANT ALL on feature_phenotype to PUBLIC; >>>> >>>> create index feature_phenotype_idx2 on feature_phenotype >>>> (phenotype_id); >>>> GRANT ALL on feature_phenotype to PUBLIC; >>>> >>>> >>>> create table phendesc ( >>>> phendesc_id serial not null, >>>> primary key (phendesc_id), >>>> genotype_id int not null; >>>> foreign key (genotype_id) references genotype (genotype_id)=20= >>>> on >>>> delete cascade, >>>> environment_id int; >>>> foreign key (environment_id) references environment ( >>>> environment_id) on delete cascade, >>>> description text not null, >>>> pub_id int not null, >>>> foreign key (pub_id) references pub (pub_id) on delete cascade, >>>> >>>> unique(genotype_id, environment_id, pub_id) >>>> ); >>>> create index phendesc_idx1 on phendesc (genotype_id); >>>> create index phendesc_idx2 on phendesc (environment_id); >>>> create index phendesc_idx3 on phendesc (pub_id); >>>> >>>> >>>> >>>> create table phenotype_comparison ( >>>> phenotype_comparison_id serial not null, >>>> primary key (phenotype_comparison_id), >>>> genotype1_id int not null, >>>> foreign key (genotype1_id) references genotype = (genotype_id) >>>> on delete cascade, >>>> environment1_id int, >>>> foreign key (environment1_id) references environment >>>> (environment_id) on delete cascade, >>>> genotype2_id int not null, >>>> foreign key (genotype2_id) references genotype = (genotype_id) >>>> on delete cascade, >>>> environment2_id int, >>>> foreign key (environment2_id) references environment >>>> (environment_id) on delete cascade, >>>> phenotype1_id int not null, >>>> foreign key (phenotype1_id) references phenotype >>>> (phenotype_id) on delete cascade, >>>> phenotype2_id int, >>>> foreign key (phenotype2_id) references phenotype >>>> (phenotype_id) on delete cascade, >>>> type_id int not null, >>>> foreign key (type_id) references cvterm (cvterm_id) on=20 >>>> delete >>>> cascade, >>>> pub_id int not null, >>>> foreign key (pub_id) references pub (pub_id) on delete=20 >>>> cascade, >>>> unique(genotype1_id, environment1_id, genotype2_id,=20 >>>> environment2_id, >>>> phenotype1_id, phenotype2_id, >>>> type_id, >>>> pub_id) >>>> ) >>>> >>> Linda Sperling >>> Centre de G=E9n=E9tique Mol=E9culaire >>> CNRS >>> Avenue de la Terrasse >>> 91198 Gif-sur-Yvette CEDEX >>> FRANCE >>> >>> spe...@cg... >>> +33 (0)1 69 82 32 09 (telephone) >>> +33 (0)1 69 82 31 50 (fax) >>> http://paramecium.cgm.cnrs-gif.fr >>> >>> SAUVONS LA RECHERCHE >>> http://recherche-en-danger.apinc.org >>> >> >> >> >> ------------------------------------------------------- >> SF email is sponsored by - The IT Product Guide >> Read honest & candid reviews on hundreds of IT Products from real=20 >> users. >> Discover which products truly live up to the hype. Start reading now. >> http://ads.osdn.com/?ad_ide95&alloc_id=14396&op=3Dclick >> _______________________________________________ >> Gmod-schema mailing list >> Gmo...@li... >> https://lists.sourceforge.net/lists/listinfo/gmod-schema >> >> Linda Sperling Centre de G=E9n=E9tique Mol=E9culaire CNRS Avenue de la Terrasse 91198 Gif-sur-Yvette CEDEX FRANCE spe...@cg... +33 (0)1 69 82 32 09 (telephone) +33 (0)1 69 82 31 50 (fax) http://paramecium.cgm.cnrs-gif.fr SAUVONS LA RECHERCHE http://recherche-en-danger.apinc.org |
|
From: Chris M. <cj...@fr...> - 2005-02-23 22:57:03
|
On Wed, 23 Feb 2005, David Emmert wrote: > Hi All, > > First, I apologize for the shudder of horror which seeing this thread > resurrected in your inbox has probably caused some of you. as long as it's not cvterms as dbxrefs again! > Unfortunately, we need to settle this issue before FlyBase can go > forward with implementing our genetic data in chado, so: deep breath, > center, and exhale... holding breath... > Aubrey and I chatted today, and we had agreed that FlyBase would go > forward with the "denormalization" option of adding a pub_id fkey to > featureprop, and making the unique key be (feature_id, type_id, > pub_id). I was about to write a note announcing the Big News, but did > some exploration of the data we already have in chado_dmel, and > discussed this whole thing with Pinglei, and its clear now that this is > not a workable solution for three reasons: > > 1) Because we already have (many, many) cases in chado_dmel where we > have multiple featureprops for a given feature of the same feature.type > and from a single publication. Some of these originated from apollo, > but we have them from other sources as well. > > 2) Because apollo itself allows multiple featureprops (specifically, of > type "comment"), which are not differentiated by anything like a > publication. So that in order to implement this plan, we would have to > have corresponding changes in apollo, which is, I think, unreasonable to > expect. even with those changes in apollo, i don't see what is unusual about having a single publication with multiple values for a featureprop.type > 3) Because we would prefer the solution we adopt to work also for other > property tables like pubprop, and this solution certainly would not. > That is to say, we can't make a pubprop.pub_id column. > > The other options we discussed were: 1) to adopt the use of functions as > below, and 2) to add a "value_checksum" column, which would contain, for > example, the md5checksum for whatever string is in the value column, and > could be included in the unique index. yikes! > It seemed that the general consensus was that we shouldn't go the > value.checksum route, as the column would have to be not null in order > to be useful as part of any unique key, and we don't want to burden > everyone who would use chado with calculating checksums for every insert > into a -prop table. phew! > So we've decided that our best option is to go forward with the plan > below, using rank in the unique key, and following Chris's advice to put > functions on the server side. This is not beautiful, but it will work. <inhale> > I also think this is a fairly flexible option, since it will allow chado > users who can maintain rank to use chado & XORT in the usual > straight-forward manner, and also let groups like FB, who can't or don't > store property ranks, to use XORT, albeit with the overhead cost of > using those functions. And of course, if people want to use XORT and chado without the update functions, they will be able to do so, provided their ChadoXML is static rather than transactional. In fact even transactional should be fine, if all the values for a featureprop (feature_id,type_id) are considered to be atomic/indivisible, as is often the case for many chado users. > I'd love to hear what anybody has to say about this. If there are any > other options we've missed, please don't be shy about bringing them on! Thanks a lot for taking the time to look at this in depth & summarising it so concisely. > Best, > > -Dave > > On 02/11/05 at 17:35 Chris Mungall wrote: > > > > > > On Fri, 11 Feb 2005, Pinglei Zhou wrote: > > > > > Hi Aubrey, Chris, > > > > > > > > > > > > >Do you mean file formats that don't support attribution? I don't > > see > > > >that as a problem, as such file formats won't expect to support the > > full > > > >richness of FlyBase data. > > > > > > > >> Change XORT > > > > > > > >Right. I think all that remains in this thread is to determine > > whether > > > >there are XORT fixes for this that are sufficiently easy that it's > > OK to > > > >ask Pinglei to implement them. Pinglei is the one to speak to > > that. > > > > > > So what you expect is XORT can deal with some arbitary SQL, right ? > > I > > > just have quick code, I think its doable, see attached sample file > > for > > > reference. here I add tag '_sql', which will be one select stm(if > > > needed, could be multiple stm, eg. a few insertion, delete, than > > LAST > > > one be select stm, separate by ';' ?). so now we have a few ways to > > > specify data:data self, foreign key reference, and SQL result. > > > > > > See the attached sample, try to insert new featureprop which never > > > overwrite old one. One problem I can see here, what happen it return > > > nothing ? > > > > > > Anyway, does this help to solve some of your problem ? If not, what > > we > > > need to improve ? > > > > Hi Pinglei > > > > This looks like a useful addition to xort. Your attached example xml > > deals > > with the use case where we wish to append to a list. I don't think > > your > > sql is adequately constrained - you need to also constrain feature by > > organism_id and the cvterm by cv_id to be safe. > > > > We also have to deal with Aubrey's use case which was providing an > > update/delete for a fp that is additonally constrained by > > featureprop_pub > > > > This can be accommodated by extending your sql: > > > > [i'm using pseudo-sql with * to indicate the appropriate join syntax] > > > > <rank> > > <_sql> > > select max(fp.rank)+1 > > from > > feature f * organism o * cvterm ft * cv ftcv > > featureprop fp * cvterm fptype * cv fptypecv * > > featureprop_pub fpp * pub * cvterm pubtype * cv pubtypecv > > where o.genus='Drosophila' and o.species='melanogaster' > > and f.uniquename='CGxxxx' and fp.name='gene' and > > ftcv.name='sequence' > > and fptype.name='...' and fptypecv.name='props' > > and pub.uniquename='FBrf...' and pubtype.name='..' > > and pubtypecv.name='pub' > > </_sql> > > </_rank> > > > > Wow! > > > > This may seem over the top, but none of the above joins can be safely > > omitted > > > > It seems to go against the whole rationale for chadoxml and xort to > > start > > embedding this in the xml. If Aubrey (or whoever the aubrey equivalent > > is > > in any MOD :) is going to write sql, he may as well write the whole > > transaction set in sql rather than an awkward mix of xml and sql. > > > > What about my solution of calling a function? That way the complex sql > > is > > hidden in the implementation, which removes the need to embed it in > > the > > data producer code and the xml > > > > The spec may look something like this: > > > > -- create a new set of values, or append to existing set > > add_fp(feature_id,type_id,value,pub_id) > > > > -- ditto, but with no provenance data > > add_fp(feature_id,type_id,value) > > > > -- remove a fp > > del_fp(feature_id,type_id,value,pub_id) > > > > -- remove all fps of a specific type from a certain source > > del_fp(feature_id,type_id,pub_id) > > > > chadoxml function calls would then retain the same structure as > > existing > > chadoxml, which seems nice and simple and elegant. Function arguments > > could be macros or expanded nodes. > > > > Of course, someone would still have to write the pgplsql for the > > functions, but this is fairly simple. > > > > I strongly believe that the complex sql belongs on the server side, > > either > > in the DBMS or hidden in a middleware layer. In fact, I'd rather see > > the > > schema denormalized than this (although denormalization doesn't solve > > the > > rank problem, unless we get rid of rank altogether and forget about > > having > > a UC for fp). > > > > There are other possible solutions. For example, an additional > > transform > > between the xml produced by the data producer and the xml that is > > finally > > loaded; xort could allow 'hooks' into server-specific perl code that > > would > > be evaluated on a node. This is essentially the same as the dbms > > functions > > solution, the difference being the code sits in front of the dbms > > rather > > than behind it. > > > > I don't want to seem intransigent or anything - FB should adopt > > whatever > > solution is best for them, other MODs can do it their own way. > > However, > > like Aubrey says, we should focus on getting this right, even if it is > > a > > painful process... > > > > c > > > > > Cheers, > > > > > > > > > Pinglei > > > > |
|
From: David E. <em...@mo...> - 2005-02-23 22:32:44
|
Hi All, First, I apologize for the shudder of horror which seeing this thread resurrected in your inbox has probably caused some of you. Unfortunately, we need to settle this issue before FlyBase can go forward with implementing our genetic data in chado, so: deep breath, center, and exhale... Aubrey and I chatted today, and we had agreed that FlyBase would go forward with the "denormalization" option of adding a pub_id fkey to featureprop, and making the unique key be (feature_id, type_id, pub_id). I was about to write a note announcing the Big News, but did some exploration of the data we already have in chado_dmel, and discussed this whole thing with Pinglei, and its clear now that this is not a workable solution for three reasons: 1) Because we already have (many, many) cases in chado_dmel where we have multiple featureprops for a given feature of the same feature.type and from a single publication. Some of these originated from apollo, but we have them from other sources as well. 2) Because apollo itself allows multiple featureprops (specifically, of type "comment"), which are not differentiated by anything like a publication. So that in order to implement this plan, we would have to have corresponding changes in apollo, which is, I think, unreasonable to expect. 3) Because we would prefer the solution we adopt to work also for other property tables like pubprop, and this solution certainly would not. That is to say, we can't make a pubprop.pub_id column. The other options we discussed were: 1) to adopt the use of functions as below, and 2) to add a "value_checksum" column, which would contain, for example, the md5checksum for whatever string is in the value column, and could be included in the unique index. It seemed that the general consensus was that we shouldn't go the value.checksum route, as the column would have to be not null in order to be useful as part of any unique key, and we don't want to burden everyone who would use chado with calculating checksums for every insert into a -prop table. So we've decided that our best option is to go forward with the plan below, using rank in the unique key, and following Chris's advice to put functions on the server side. This is not beautiful, but it will work. I also think this is a fairly flexible option, since it will allow chado users who can maintain rank to use chado & XORT in the usual straight-forward manner, and also let groups like FB, who can't or don't store property ranks, to use XORT, albeit with the overhead cost of using those functions. I'd love to hear what anybody has to say about this. If there are any other options we've missed, please don't be shy about bringing them on! Best, -Dave On 02/11/05 at 17:35 Chris Mungall wrote: > > > On Fri, 11 Feb 2005, Pinglei Zhou wrote: > > > Hi Aubrey, Chris, > > > > > > > > >Do you mean file formats that don't support attribution? I don't > see > > >that as a problem, as such file formats won't expect to support the > full > > >richness of FlyBase data. > > > > > >> Change XORT > > > > > >Right. I think all that remains in this thread is to determine > whether > > >there are XORT fixes for this that are sufficiently easy that it's > OK to > > >ask Pinglei to implement them. Pinglei is the one to speak to > that. > > > > So what you expect is XORT can deal with some arbitary SQL, right ? > I > > just have quick code, I think its doable, see attached sample file > for > > reference. here I add tag '_sql', which will be one select stm(if > > needed, could be multiple stm, eg. a few insertion, delete, than > LAST > > one be select stm, separate by ';' ?). so now we have a few ways to > > specify data:data self, foreign key reference, and SQL result. > > > > See the attached sample, try to insert new featureprop which never > > overwrite old one. One problem I can see here, what happen it return > > nothing ? > > > > Anyway, does this help to solve some of your problem ? If not, what > we > > need to improve ? > > Hi Pinglei > > This looks like a useful addition to xort. Your attached example xml > deals > with the use case where we wish to append to a list. I don't think > your > sql is adequately constrained - you need to also constrain feature by > organism_id and the cvterm by cv_id to be safe. > > We also have to deal with Aubrey's use case which was providing an > update/delete for a fp that is additonally constrained by > featureprop_pub > > This can be accommodated by extending your sql: > > [i'm using pseudo-sql with * to indicate the appropriate join syntax] > > <rank> > <_sql> > select max(fp.rank)+1 > from > feature f * organism o * cvterm ft * cv ftcv > featureprop fp * cvterm fptype * cv fptypecv * > featureprop_pub fpp * pub * cvterm pubtype * cv pubtypecv > where o.genus='Drosophila' and o.species='melanogaster' > and f.uniquename='CGxxxx' and fp.name='gene' and > ftcv.name='sequence' > and fptype.name='...' and fptypecv.name='props' > and pub.uniquename='FBrf...' and pubtype.name='..' > and pubtypecv.name='pub' > </_sql> > </_rank> > > Wow! > > This may seem over the top, but none of the above joins can be safely > omitted > > It seems to go against the whole rationale for chadoxml and xort to > start > embedding this in the xml. If Aubrey (or whoever the aubrey equivalent > is > in any MOD :) is going to write sql, he may as well write the whole > transaction set in sql rather than an awkward mix of xml and sql. > > What about my solution of calling a function? That way the complex sql > is > hidden in the implementation, which removes the need to embed it in > the > data producer code and the xml > > The spec may look something like this: > > -- create a new set of values, or append to existing set > add_fp(feature_id,type_id,value,pub_id) > > -- ditto, but with no provenance data > add_fp(feature_id,type_id,value) > > -- remove a fp > del_fp(feature_id,type_id,value,pub_id) > > -- remove all fps of a specific type from a certain source > del_fp(feature_id,type_id,pub_id) > > chadoxml function calls would then retain the same structure as > existing > chadoxml, which seems nice and simple and elegant. Function arguments > could be macros or expanded nodes. > > Of course, someone would still have to write the pgplsql for the > functions, but this is fairly simple. > > I strongly believe that the complex sql belongs on the server side, > either > in the DBMS or hidden in a middleware layer. In fact, I'd rather see > the > schema denormalized than this (although denormalization doesn't solve > the > rank problem, unless we get rid of rank altogether and forget about > having > a UC for fp). > > There are other possible solutions. For example, an additional > transform > between the xml produced by the data producer and the xml that is > finally > loaded; xort could allow 'hooks' into server-specific perl code that > would > be evaluated on a node. This is essentially the same as the dbms > functions > solution, the difference being the code sits in front of the dbms > rather > than behind it. > > I don't want to seem intransigent or anything - FB should adopt > whatever > solution is best for them, other MODs can do it their own way. > However, > like Aubrey says, we should focus on getting this right, even if it is > a > painful process... > > c > > > Cheers, > > > > > > Pinglei > > |
|
From: Chris M. <cj...@fr...> - 2005-02-23 17:06:05
|
On Wed, 23 Feb 2005, Pinglei Zhou wrote: > > >Hi Pinglei, > > > >This schema looks good (it solves the unique observable_id problem, and > >much more). But, I still have two questions (maybe it is one > >question): > > > >Where do you put genetic interactions (suppresion, enhancement, etc.)? > > it should involve phenstatement/phenotype_comparison tables. > > >What is the purpose of the phenotype_comparison table? > > phenotype_comparison will replace the old > interaction/interactionobject/interaction/subject table I don't think this is an adequate explanation! The interaction/interactionobject/interaction/subject tables were removed from the genetic module at the end of 2003. I was under the impression that the version of the genetic module that was in cvs was the latest one, and relatively stable. This is the version based around 'gcontext's, and reasonably well commented. It seems that this has been abandoned and there is now a new version. It would therefore seem like a good idea to commit this new one to cvs and to add some comments to it Cheers Chris > Pinglei > > > > > >On Feb 23, 2005, at 4:45 PM, Pinglei Zhou wrote: > > > >> Hi Linda, > >> > >> Not sure if I mentioned before, genetic module is still under > >> development, we (FlyBase) are testing on a new > >> design which is much different from the one in cvs. Here test I means > >> we try to feed Drosophila genetic to see > >> if it will break the schema. welcome to give any suggestions. > >> > >> > >> Pinglei > >> > >> > >> > >> > >> > >> genetic module currently tested by FlyBase > >> > >> create table genotype ( > >> genotype_id serial not null, > >> primary key (genotype_id), > >> uniquename text not null, > >> description varchar(255), > >> unique(uniquename) > >> ); > >> create index genotype_idx1 on genotype(uniquename); > >> GRANT ALL on genotype_genotype_id_seq to PUBLIC; > >> GRANT ALL on genotype to PUBLIC; > >> > >> > >> > >> create table feature_genotype ( > >> =09feature_genotype_id serial not null, > >> =09primary key (feature_genotype_id), > >> =09feature_id int not null, > >> =09foreign key (feature_id) references feature (feature_id) on delete > >> cascade, > >> =09genotype_id int not null, > >> =09foreign key (genotype_id) references genotype (genotype_id) on dele= te > >> cascade, > >> =09chromosome_id int, > >> =09foreign key (chromosome_id) references feature (feature_id) on dele= te > >> set null, > >> =09rank int not null, > >> =09cgroup =09int not null, > >> =09cvterm_id int not null, > >> =09foreign key (cvterm_id) references cvterm (cvterm_id) on delete > >> cascade, > >> =09unique(feature_id, genotype_id, cvterm_id) > >> ); > >> create index feature_genotype_idx1 on feature_genotype (feature_id); > >> GRANT ALL on feature_genotype to PUBLIC; > >> > >> create index feature_genotype_idx2 on feature_genotype (genotype_id); > >> GRANT ALL on feature_genotype to PUBLIC; > >> > >> create table environment ( > >> =09environment_id serial not NULL, > >> =09primary key (environment_id), > >> uniquename text not null, > >> =09description =09text, > >> =09unique(uniquename) > >> ); > >> create index environment_idx1 on environment(uniquename); > >> > >> > >> create table env_cvterm ( > >> =09env_cvterm_id serial not null, > >> =09primary key (env_cvterm_id), > >> =09environment_id =09int not null, > >> =09foreign key (environment_id) references environment (environment_id= ) > >> on delete cascade, > >> =09cvterm_id int not null, > >> =09foreign key (cvterm_id) references cvterm (cvterm_id) on delete > >> cascade, > >> > >> =09unique(environment_id, cvterm_id) > >> ); > >> create index env_cvterm_idx1 on env_cvterm (environment_id); > >> create index env_cvterm_idx2 on env_cvterm (cvterm_id); > >> > >> > >> create table phenotype ( > >> =09phenotype_id serial not null, > >> =09primary key (phenotype_id), > >> uniquename text not null, > >> =09observable_id int, > >> =09foreign key (observable_id) references cvterm (cvterm_id) on delete > >> cascade, > >> =09attr_id int, > >> =09foreign key (attr_id) references cvterm (cvterm_id) on delete set > >> null, > >> =09value text, > >> =09cvalue_id int, > >> =09foreign key (cvalue_id) references cvterm (cvterm_id) on delete set > >> null, > >> =09assay_id int, > >> =09foreign key (assay_id) references cvterm (cvterm_id) on delete set > >> null, > >> =09unique( uniquename ) > >> ); > >> create index phenotype_idx1 on phenotype (cvalue_id); > >> create index phenotype_idx2 on phenotype (observable_id); > >> create index phenotype_idx3 on phenotype (attr_id); > >> > >> GRANT ALL on phenotype_phenotype_id_seq to PUBLIC; > >> GRANT ALL on phenotype to PUBLIC; > >> > >> create index phenotype_idx1 on phenotype (type_id); > >> GRANT ALL on phenotype to PUBLIC; > >> > >> create index phenotype_idx2 on phenotype (pub_id); > >> GRANT ALL on phenotype to PUBLIC; > >> > >> create index phenotype_idx3 on phenotype (background_genotype_id); > >> GRANT ALL on phenotype to PUBLIC; > >> > >> create table phenotype_cvterm ( > >> =09phenotype_cvterm_id serial not null, > >> =09primary key (phenotype_cvterm_id), > >> =09phenotype_id int not null, > >> =09foreign key (phenotype_id) references phenotype (phenotype_id) on > >> delete cascade, > >> =09cvterm_id int not null, > >> =09foreign key (cvterm_id) references cvterm (cvterm_id) on delete > >> cascade, > >> =09unique(phenotype_id, cvterm_id) > >> ); > >> > >> GRANT ALL on phenotype_cvt_phenotype_cvt_seq to PUBLIC; > >> GRANT ALL on phenotype_cvterm to PUBLIC; > >> > >> create index phenotype_cvterm_idx1 on phenotype_cvterm (phenotype_id); > >> GRANT ALL on phenotype_cvterm to PUBLIC; > >> > >> create index phenotype_cvterm_idx2 on phenotype_cvterm (cvterm_id); > >> GRANT ALL on phenotype_cvterm to PUBLIC; > >> > >> > >> > >> create table phenstatement ( > >> =09phenstatement_id serial not null, > >> =09primary key (phenstatement_id), > >> =09genotype_id int not null, > >> =09foreign key (genotype_id) references genotype (genotype_id) on dele= te > >> cascade, > >> =09environment_id int, > >> =09foreign key (environment_id) references environment (environment_id= ) > >> on delete cascade, > >> =09phenotype_id int not null, > >> =09foreign key (phenotype_id) references phenotype (phenotype_id) on > >> delete cascade, > >> =09type_id int, > >> =09foreign key (type_id) references cvterm (cvterm_id) on delete casca= de, > >> =09pub_id int not null, > >> =09foreign key (pub_id) references pub (pub_id) on delete cascade, > >> =09unique(genotype_id, environment_id, phenotype_id, type_id, pub_id) > >> ); > >> create index phenstatement_idx1 on phenstatement (genotype_env_id); > >> create index phenstatement_idx2 on phenstatement (phenotype_id); > >> > >> > >> > >> create table feature_phenotype ( > >> feature_phenotype_id serial not null, > >> primary key (feature_phenotype_id), > >> feature_id int not null, > >> foreign key (feature_id) references feature (feature_id) on > >> delete cascade, > >> phenotype_id int not null, > >> foreign key (phenotype_id) references phenotype (phenotype_id) > >> on delete cascade, > >> unique(feature_id,phenotype_id) > >> ); > >> GRANT ALL on feature_pheno_feature_pheno_seq to PUBLIC; > >> GRANT ALL on feature_phenotype to PUBLIC; > >> > >> create index feature_phenotype_idx1 on feature_phenotype (feature_id); > >> GRANT ALL on feature_phenotype to PUBLIC; > >> > >> create index feature_phenotype_idx2 on feature_phenotype > >> (phenotype_id); > >> GRANT ALL on feature_phenotype to PUBLIC; > >> > >> > >> create table phendesc ( > >> =09phendesc_id serial not null, > >> =09primary key (phendesc_id), > >> =09genotype_id int not null; > >> foreign key (genotype_id) references genotype (genotype_id) on > >> delete cascade, > >> =09environment_id int; > >> foreign key (environment_id) references environment ( > >> environment_id) on delete cascade, > >> =09description text not null, > >> =09pub_id int not null, > >> =09foreign key (pub_id) references pub (pub_id) on delete cascade, > >> > >> =09unique(genotype_id, environment_id, pub_id) > >> ); > >> create index phendesc_idx1 on phendesc (genotype_id); > >> create index phendesc_idx2 on phendesc (environment_id); > >> create index phendesc_idx3 on phendesc (pub_id); > >> > >> > >> > >> create table phenotype_comparison ( > >> =09phenotype_comparison_id serial not null, > >> =09primary key (phenotype_comparison_id), > >> =09genotype1_id int not null, > >> foreign key (genotype1_id) references genotype (genotype_id) > >> on delete cascade, > >> =09environment1_id int, > >> foreign key (environment1_id) references environment > >> (environment_id) on delete cascade, > >> =09genotype2_id int not null, > >> foreign key (genotype2_id) references genotype (genotype_id) > >> on delete cascade, > >> =09environment2_id int, > >> foreign key (environment2_id) references environment > >> (environment_id) on delete cascade, > >> =09phenotype1_id int not null, > >> foreign key (phenotype1_id) references phenotype > >> (phenotype_id) on delete cascade, > >> =09phenotype2_id int, > >> foreign key (phenotype2_id) references phenotype > >> (phenotype_id) on delete cascade, > >> =09type_id int not null, > >> foreign key (type_id) references cvterm (cvterm_id) on delete > >> cascade, > >> =09pub_id =09int not null, > >> foreign key (pub_id) references pub (pub_id) on delete cascade= , > >> =09unique(genotype1_id, environment1_id, genotype2_id, environment2_id= , > >> phenotype1_id, phenotype2_id, > >> type_id, > >> pub_id) > >> ) > >> > >Linda Sperling > >Centre de G=E9n=E9tique Mol=E9culaire > >CNRS > >Avenue de la Terrasse > >91198 Gif-sur-Yvette CEDEX > >FRANCE > > > >spe...@cg... > >+33 (0)1 69 82 32 09 =09(telephone) > >+33 (0)1 69 82 31 50=09(fax) > >http://paramecium.cgm.cnrs-gif.fr > > > >SAUVONS LA RECHERCHE > >http://recherche-en-danger.apinc.org > > > > > > ------------------------------------------------------- > SF email is sponsored by - The IT Product Guide > Read honest & candid reviews on hundreds of IT Products from real users. > Discover which products truly live up to the hype. Start reading now. > http://ads.osdn.com/?ad_ide95&alloc_id=14396&op=3Dclick > _______________________________________________ > Gmod-schema mailing list > Gmo...@li... > https://lists.sourceforge.net/lists/listinfo/gmod-schema > |
|
From: Pinglei Z. <zh...@mo...> - 2005-02-23 16:51:58
|
=20 >Hi Pinglei, > >This schema looks good (it solves the unique observable_id problem, and=20 >much more). But, I still have two questions (maybe it is one=20 >question): > >Where do you put genetic interactions (suppresion, enhancement, etc.)? it should involve phenstatement/phenotype_comparison tables. >What is the purpose of the phenotype_comparison table? phenotype_comparison will replace the old=20 interaction/interactionobject/interaction/subject table Pinglei > >On Feb 23, 2005, at 4:45 PM, Pinglei Zhou wrote: > >> Hi Linda, >> >> Not sure if I mentioned before, genetic module is still under=20 >> development, we (FlyBase) are testing on a new >> design which is much different from the one in cvs. Here test I means=20 >> we try to feed Drosophila genetic to see >> if it will break the schema. welcome to give any suggestions. >> >> >> Pinglei >> >> >> >> >> >> genetic module currently tested by FlyBase >> >> create table genotype ( >> genotype_id serial not null, >> primary key (genotype_id), >> uniquename text not null, >> description varchar(255), >> unique(uniquename) >> ); >> create index genotype_idx1 on genotype(uniquename); >> GRANT ALL on genotype_genotype_id_seq to PUBLIC; >> GRANT ALL on genotype to PUBLIC; >> >> >> >> create table feature_genotype ( >> =09feature_genotype_id serial not null, >> =09primary key (feature_genotype_id), >> =09feature_id int not null, >> =09foreign key (feature_id) references feature (feature_id) on delete=20 >> cascade, >> =09genotype_id int not null, >> =09foreign key (genotype_id) references genotype (genotype_id) on delete= =20 >> cascade, >> =09chromosome_id int, >> =09foreign key (chromosome_id) references feature (feature_id) on delete= =20 >> set null, >> =09rank int not null, >> =09cgroup =09int not null, >> =09cvterm_id int not null, >> =09foreign key (cvterm_id) references cvterm (cvterm_id) on delete=20 >> cascade, >> =09unique(feature_id, genotype_id, cvterm_id) >> ); >> create index feature_genotype_idx1 on feature_genotype (feature_id); >> GRANT ALL on feature_genotype to PUBLIC; >> >> create index feature_genotype_idx2 on feature_genotype (genotype_id); >> GRANT ALL on feature_genotype to PUBLIC; >> >> create table environment ( >> =09environment_id serial not NULL, >> =09primary key (environment_id), >> uniquename text not null, >> =09description =09text, >> =09unique(uniquename) >> ); >> create index environment_idx1 on environment(uniquename); >> >> >> create table env_cvterm ( >> =09env_cvterm_id serial not null, >> =09primary key (env_cvterm_id), >> =09environment_id =09int not null, >> =09foreign key (environment_id) references environment (environment_id)= =20 >> on delete cascade, >> =09cvterm_id int not null, >> =09foreign key (cvterm_id) references cvterm (cvterm_id) on delete=20 >> cascade, >> >> =09unique(environment_id, cvterm_id) >> ); >> create index env_cvterm_idx1 on env_cvterm (environment_id); >> create index env_cvterm_idx2 on env_cvterm (cvterm_id); >> >> >> create table phenotype ( >> =09phenotype_id serial not null, >> =09primary key (phenotype_id), >> uniquename text not null, >> =09observable_id int, >> =09foreign key (observable_id) references cvterm (cvterm_id) on delete= =20 >> cascade, >> =09attr_id int, >> =09foreign key (attr_id) references cvterm (cvterm_id) on delete set=20 >> null, >> =09value text, >> =09cvalue_id int, >> =09foreign key (cvalue_id) references cvterm (cvterm_id) on delete set= =20 >> null, >> =09assay_id int, >> =09foreign key (assay_id) references cvterm (cvterm_id) on delete set=20 >> null, >> =09unique( uniquename )=09 >> ); >> create index phenotype_idx1 on phenotype (cvalue_id); >> create index phenotype_idx2 on phenotype (observable_id); >> create index phenotype_idx3 on phenotype (attr_id); >> >> GRANT ALL on phenotype_phenotype_id_seq to PUBLIC; >> GRANT ALL on phenotype to PUBLIC; >> >> create index phenotype_idx1 on phenotype (type_id); >> GRANT ALL on phenotype to PUBLIC; >> >> create index phenotype_idx2 on phenotype (pub_id); >> GRANT ALL on phenotype to PUBLIC; >> >> create index phenotype_idx3 on phenotype (background_genotype_id); >> GRANT ALL on phenotype to PUBLIC; >> >> create table phenotype_cvterm ( >> =09phenotype_cvterm_id serial not null, >> =09primary key (phenotype_cvterm_id), >> =09phenotype_id int not null, >> =09foreign key (phenotype_id) references phenotype (phenotype_id) on=20 >> delete cascade, >> =09cvterm_id int not null, >> =09foreign key (cvterm_id) references cvterm (cvterm_id) on delete=20 >> cascade, >> =09unique(phenotype_id, cvterm_id) >> ); >> =09 >> GRANT ALL on phenotype_cvt_phenotype_cvt_seq to PUBLIC; >> GRANT ALL on phenotype_cvterm to PUBLIC; >> >> create index phenotype_cvterm_idx1 on phenotype_cvterm (phenotype_id); >> GRANT ALL on phenotype_cvterm to PUBLIC; >> >> create index phenotype_cvterm_idx2 on phenotype_cvterm (cvterm_id); >> GRANT ALL on phenotype_cvterm to PUBLIC; >> >> >> >> create table phenstatement ( >> =09phenstatement_id serial not null, >> =09primary key (phenstatement_id), >> =09genotype_id int not null, >> =09foreign key (genotype_id) references genotype (genotype_id) on delete= =20 >> cascade, >> =09environment_id int, >> =09foreign key (environment_id) references environment (environment_id)= =20 >> on delete cascade, >> =09phenotype_id int not null, >> =09foreign key (phenotype_id) references phenotype (phenotype_id) on=20 >> delete cascade, >> =09type_id int, >> =09foreign key (type_id) references cvterm (cvterm_id) on delete cascade= , >> =09pub_id int not null, >> =09foreign key (pub_id) references pub (pub_id) on delete cascade, >> =09unique(genotype_id, environment_id, phenotype_id, type_id, pub_id)=09 >> ); >> create index phenstatement_idx1 on phenstatement (genotype_env_id); >> create index phenstatement_idx2 on phenstatement (phenotype_id); >> >> >> >> create table feature_phenotype ( >> feature_phenotype_id serial not null, >> primary key (feature_phenotype_id), >> feature_id int not null, >> foreign key (feature_id) references feature (feature_id) on=20 >> delete cascade, >> phenotype_id int not null, >> foreign key (phenotype_id) references phenotype (phenotype_id)=20 >> on delete cascade, >> unique(feature_id,phenotype_id) >> ); >> GRANT ALL on feature_pheno_feature_pheno_seq to PUBLIC; >> GRANT ALL on feature_phenotype to PUBLIC; >> >> create index feature_phenotype_idx1 on feature_phenotype (feature_id); >> GRANT ALL on feature_phenotype to PUBLIC; >> >> create index feature_phenotype_idx2 on feature_phenotype=20 >> (phenotype_id); >> GRANT ALL on feature_phenotype to PUBLIC; >> >> >> create table phendesc ( >> =09phendesc_id serial not null, >> =09primary key (phendesc_id), >> =09genotype_id int not null; >> foreign key (genotype_id) references genotype (genotype_id) on= =20 >> delete cascade, >> =09environment_id int; >> foreign key (environment_id) references environment (=20 >> environment_id) on delete cascade, >> =09description text not null, >> =09pub_id int not null, >> =09foreign key (pub_id) references pub (pub_id) on delete cascade, >> >> =09unique(genotype_id, environment_id, pub_id) >> ); >> create index phendesc_idx1 on phendesc (genotype_id); >> create index phendesc_idx2 on phendesc (environment_id); >> create index phendesc_idx3 on phendesc (pub_id); >> >> >> >> create table phenotype_comparison ( >> =09phenotype_comparison_id serial not null, >> =09primary key (phenotype_comparison_id), >> =09genotype1_id int not null, >> foreign key (genotype1_id) references genotype (genotype_id)=20 >> on delete cascade, >> =09environment1_id int, >> foreign key (environment1_id) references environment=20 >> (environment_id) on delete cascade, >> =09genotype2_id int not null, >> foreign key (genotype2_id) references genotype (genotype_id)=20 >> on delete cascade, >> =09environment2_id int, >> foreign key (environment2_id) references environment=20 >> (environment_id) on delete cascade, >> =09phenotype1_id int not null, >> foreign key (phenotype1_id) references phenotype=20 >> (phenotype_id) on delete cascade, >> =09phenotype2_id int, >> foreign key (phenotype2_id) references phenotype=20 >> (phenotype_id) on delete cascade, >> =09type_id int not null, >> foreign key (type_id) references cvterm (cvterm_id) on delete=20 >> cascade, >> =09pub_id =09int not null, >> foreign key (pub_id) references pub (pub_id) on delete cascade, >> =09unique(genotype1_id, environment1_id, genotype2_id, environment2_id,= =20 >> phenotype1_id, phenotype2_id, >> type_id, >> pub_id) >> ) >> >Linda Sperling >Centre de G=E9n=E9tique Mol=E9culaire >CNRS >Avenue de la Terrasse >91198 Gif-sur-Yvette CEDEX >FRANCE > >spe...@cg... >+33 (0)1 69 82 32 09 =09(telephone) >+33 (0)1 69 82 31 50=09(fax) >http://paramecium.cgm.cnrs-gif.fr > >SAUVONS LA RECHERCHE >http://recherche-en-danger.apinc.org > |
|
From: Linda S. <Lin...@cg...> - 2005-02-23 16:38:08
|
Hi Pinglei, This schema looks good (it solves the unique observable_id problem, and=20= much more). But, I still have two questions (maybe it is one=20 question): Where do you put genetic interactions (suppresion, enhancement, etc.)? What is the purpose of the phenotype_comparison table? Thank you, Linda On Feb 23, 2005, at 4:45 PM, Pinglei Zhou wrote: > Hi Linda, > > Not sure if I mentioned before, genetic module is still under=20 > development, we (FlyBase) are testing on a new > design which is much different from the one in cvs. Here test I means=20= > we try to feed Drosophila genetic to see > if it will break the schema. welcome to give any suggestions. > > > Pinglei > > > > > > genetic module currently tested by FlyBase > > create table genotype ( > genotype_id serial not null, > primary key (genotype_id), > uniquename text not null, > description varchar(255), > unique(uniquename) > ); > create index genotype_idx1 on genotype(uniquename); > GRANT ALL on genotype_genotype_id_seq to PUBLIC; > GRANT ALL on genotype to PUBLIC; > > > > create table feature_genotype ( > feature_genotype_id serial not null, > primary key (feature_genotype_id), > feature_id int not null, > foreign key (feature_id) references feature (feature_id) on = delete=20 > cascade, > genotype_id int not null, > foreign key (genotype_id) references genotype (genotype_id) on = delete=20 > cascade, > chromosome_id int, > foreign key (chromosome_id) references feature (feature_id) on = delete=20 > set null, > rank int not null, > cgroup int not null, > cvterm_id int not null, > foreign key (cvterm_id) references cvterm (cvterm_id) on delete=20= > cascade, > unique(feature_id, genotype_id, cvterm_id) > ); > create index feature_genotype_idx1 on feature_genotype (feature_id); > GRANT ALL on feature_genotype to PUBLIC; > > create index feature_genotype_idx2 on feature_genotype (genotype_id); > GRANT ALL on feature_genotype to PUBLIC; > > create table environment ( > environment_id serial not NULL, > primary key (environment_id), > uniquename text not null, > description text, > unique(uniquename) > ); > create index environment_idx1 on environment(uniquename); > > > create table env_cvterm ( > env_cvterm_id serial not null, > primary key (env_cvterm_id), > environment_id int not null, > foreign key (environment_id) references environment = (environment_id)=20 > on delete cascade, > cvterm_id int not null, > foreign key (cvterm_id) references cvterm (cvterm_id) on delete=20= > cascade, > > unique(environment_id, cvterm_id) > ); > create index env_cvterm_idx1 on env_cvterm (environment_id); > create index env_cvterm_idx2 on env_cvterm (cvterm_id); > > > create table phenotype ( > phenotype_id serial not null, > primary key (phenotype_id), > uniquename text not null, > observable_id int, > foreign key (observable_id) references cvterm (cvterm_id) on = delete=20 > cascade, > attr_id int, > foreign key (attr_id) references cvterm (cvterm_id) on delete = set=20 > null, > value text, > cvalue_id int, > foreign key (cvalue_id) references cvterm (cvterm_id) on delete = set=20 > null, > assay_id int, > foreign key (assay_id) references cvterm (cvterm_id) on delete = set=20 > null, > unique( uniquename )=09 > ); > create index phenotype_idx1 on phenotype (cvalue_id); > create index phenotype_idx2 on phenotype (observable_id); > create index phenotype_idx3 on phenotype (attr_id); > > GRANT ALL on phenotype_phenotype_id_seq to PUBLIC; > GRANT ALL on phenotype to PUBLIC; > > create index phenotype_idx1 on phenotype (type_id); > GRANT ALL on phenotype to PUBLIC; > > create index phenotype_idx2 on phenotype (pub_id); > GRANT ALL on phenotype to PUBLIC; > > create index phenotype_idx3 on phenotype (background_genotype_id); > GRANT ALL on phenotype to PUBLIC; > > create table phenotype_cvterm ( > phenotype_cvterm_id serial not null, > primary key (phenotype_cvterm_id), > phenotype_id int not null, > foreign key (phenotype_id) references phenotype (phenotype_id) = on=20 > delete cascade, > cvterm_id int not null, > foreign key (cvterm_id) references cvterm (cvterm_id) on delete=20= > cascade, > unique(phenotype_id, cvterm_id) > ); > =09 > GRANT ALL on phenotype_cvt_phenotype_cvt_seq to PUBLIC; > GRANT ALL on phenotype_cvterm to PUBLIC; > > create index phenotype_cvterm_idx1 on phenotype_cvterm (phenotype_id); > GRANT ALL on phenotype_cvterm to PUBLIC; > > create index phenotype_cvterm_idx2 on phenotype_cvterm (cvterm_id); > GRANT ALL on phenotype_cvterm to PUBLIC; > > > > create table phenstatement ( > phenstatement_id serial not null, > primary key (phenstatement_id), > genotype_id int not null, > foreign key (genotype_id) references genotype (genotype_id) on = delete=20 > cascade, > environment_id int, > foreign key (environment_id) references environment = (environment_id)=20 > on delete cascade, > phenotype_id int not null, > foreign key (phenotype_id) references phenotype (phenotype_id) = on=20 > delete cascade, > type_id int, > foreign key (type_id) references cvterm (cvterm_id) on delete = cascade, > pub_id int not null, > foreign key (pub_id) references pub (pub_id) on delete cascade, > unique(genotype_id, environment_id, phenotype_id, type_id, = pub_id)=09 > ); > create index phenstatement_idx1 on phenstatement (genotype_env_id); > create index phenstatement_idx2 on phenstatement (phenotype_id); > > > > create table feature_phenotype ( > feature_phenotype_id serial not null, > primary key (feature_phenotype_id), > feature_id int not null, > foreign key (feature_id) references feature (feature_id) on=20 > delete cascade, > phenotype_id int not null, > foreign key (phenotype_id) references phenotype (phenotype_id)=20= > on delete cascade, > unique(feature_id,phenotype_id) > ); > GRANT ALL on feature_pheno_feature_pheno_seq to PUBLIC; > GRANT ALL on feature_phenotype to PUBLIC; > > create index feature_phenotype_idx1 on feature_phenotype (feature_id); > GRANT ALL on feature_phenotype to PUBLIC; > > create index feature_phenotype_idx2 on feature_phenotype=20 > (phenotype_id); > GRANT ALL on feature_phenotype to PUBLIC; > > > create table phendesc ( > phendesc_id serial not null, > primary key (phendesc_id), > genotype_id int not null; > foreign key (genotype_id) references genotype (genotype_id) on=20= > delete cascade, > environment_id int; > foreign key (environment_id) references environment (=20 > environment_id) on delete cascade, > description text not null, > pub_id int not null, > foreign key (pub_id) references pub (pub_id) on delete cascade, > > unique(genotype_id, environment_id, pub_id) > ); > create index phendesc_idx1 on phendesc (genotype_id); > create index phendesc_idx2 on phendesc (environment_id); > create index phendesc_idx3 on phendesc (pub_id); > > > > create table phenotype_comparison ( > phenotype_comparison_id serial not null, > primary key (phenotype_comparison_id), > genotype1_id int not null, > foreign key (genotype1_id) references genotype (genotype_id)=20= > on delete cascade, > environment1_id int, > foreign key (environment1_id) references environment=20 > (environment_id) on delete cascade, > genotype2_id int not null, > foreign key (genotype2_id) references genotype (genotype_id)=20= > on delete cascade, > environment2_id int, > foreign key (environment2_id) references environment=20 > (environment_id) on delete cascade, > phenotype1_id int not null, > foreign key (phenotype1_id) references phenotype=20 > (phenotype_id) on delete cascade, > phenotype2_id int, > foreign key (phenotype2_id) references phenotype=20 > (phenotype_id) on delete cascade, > type_id int not null, > foreign key (type_id) references cvterm (cvterm_id) on delete=20= > cascade, > pub_id int not null, > foreign key (pub_id) references pub (pub_id) on delete = cascade, > unique(genotype1_id, environment1_id, genotype2_id, = environment2_id,=20 > phenotype1_id, phenotype2_id, > type_id, > pub_id) > ) > Linda Sperling Centre de G=E9n=E9tique Mol=E9culaire CNRS Avenue de la Terrasse 91198 Gif-sur-Yvette CEDEX FRANCE spe...@cg... +33 (0)1 69 82 32 09 (telephone) +33 (0)1 69 82 31 50 (fax) http://paramecium.cgm.cnrs-gif.fr SAUVONS LA RECHERCHE http://recherche-en-danger.apinc.org |
|
From: Pinglei Z. <zh...@mo...> - 2005-02-23 15:46:05
|
Hi Linda,
Not sure if I mentioned before, genetic module is still under development, we (FlyBase) are testing on a new
design which is much different from the one in cvs. Here test I means we try to feed Drosophila genetic to see
if it will break the schema. welcome to give any suggestions.
Pinglei
genetic module currently tested by FlyBase
create table genotype (
genotype_id serial not null,
primary key (genotype_id),
uniquename text not null,
description varchar(255),
unique(uniquename)
);
create index genotype_idx1 on genotype(uniquename);
GRANT ALL on genotype_genotype_id_seq to PUBLIC;
GRANT ALL on genotype to PUBLIC;
create table feature_genotype (
feature_genotype_id serial not null,
primary key (feature_genotype_id),
feature_id int not null,
foreign key (feature_id) references feature (feature_id) on delete cascade,
genotype_id int not null,
foreign key (genotype_id) references genotype (genotype_id) on delete cascade,
chromosome_id int,
foreign key (chromosome_id) references feature (feature_id) on delete set null,
rank int not null,
cgroup int not null,
cvterm_id int not null,
foreign key (cvterm_id) references cvterm (cvterm_id) on delete cascade,
unique(feature_id, genotype_id, cvterm_id)
);
create index feature_genotype_idx1 on feature_genotype (feature_id);
GRANT ALL on feature_genotype to PUBLIC;
create index feature_genotype_idx2 on feature_genotype (genotype_id);
GRANT ALL on feature_genotype to PUBLIC;
create table environment (
environment_id serial not NULL,
primary key (environment_id),
uniquename text not null,
description text,
unique(uniquename)
);
create index environment_idx1 on environment(uniquename);
create table env_cvterm (
env_cvterm_id serial not null,
primary key (env_cvterm_id),
environment_id int not null,
foreign key (environment_id) references environment (environment_id) on delete cascade,
cvterm_id int not null,
foreign key (cvterm_id) references cvterm (cvterm_id) on delete cascade,
unique(environment_id, cvterm_id)
);
create index env_cvterm_idx1 on env_cvterm (environment_id);
create index env_cvterm_idx2 on env_cvterm (cvterm_id);
create table phenotype (
phenotype_id serial not null,
primary key (phenotype_id),
uniquename text not null,
observable_id int,
foreign key (observable_id) references cvterm (cvterm_id) on delete cascade,
attr_id int,
foreign key (attr_id) references cvterm (cvterm_id) on delete set null,
value text,
cvalue_id int,
foreign key (cvalue_id) references cvterm (cvterm_id) on delete set null,
assay_id int,
foreign key (assay_id) references cvterm (cvterm_id) on delete set null,
unique( uniquename )
);
create index phenotype_idx1 on phenotype (cvalue_id);
create index phenotype_idx2 on phenotype (observable_id);
create index phenotype_idx3 on phenotype (attr_id);
GRANT ALL on phenotype_phenotype_id_seq to PUBLIC;
GRANT ALL on phenotype to PUBLIC;
create index phenotype_idx1 on phenotype (type_id);
GRANT ALL on phenotype to PUBLIC;
create index phenotype_idx2 on phenotype (pub_id);
GRANT ALL on phenotype to PUBLIC;
create index phenotype_idx3 on phenotype (background_genotype_id);
GRANT ALL on phenotype to PUBLIC;
create table phenotype_cvterm (
phenotype_cvterm_id serial not null,
primary key (phenotype_cvterm_id),
phenotype_id int not null,
foreign key (phenotype_id) references phenotype (phenotype_id) on delete cascade,
cvterm_id int not null,
foreign key (cvterm_id) references cvterm (cvterm_id) on delete cascade,
unique(phenotype_id, cvterm_id)
);
GRANT ALL on phenotype_cvt_phenotype_cvt_seq to PUBLIC;
GRANT ALL on phenotype_cvterm to PUBLIC;
create index phenotype_cvterm_idx1 on phenotype_cvterm (phenotype_id);
GRANT ALL on phenotype_cvterm to PUBLIC;
create index phenotype_cvterm_idx2 on phenotype_cvterm (cvterm_id);
GRANT ALL on phenotype_cvterm to PUBLIC;
create table phenstatement (
phenstatement_id serial not null,
primary key (phenstatement_id),
genotype_id int not null,
foreign key (genotype_id) references genotype (genotype_id) on delete cascade,
environment_id int,
foreign key (environment_id) references environment (environment_id) on delete cascade,
phenotype_id int not null,
foreign key (phenotype_id) references phenotype (phenotype_id) on delete cascade,
type_id int,
foreign key (type_id) references cvterm (cvterm_id) on delete cascade,
pub_id int not null,
foreign key (pub_id) references pub (pub_id) on delete cascade,
unique(genotype_id, environment_id, phenotype_id, type_id, pub_id)
);
create index phenstatement_idx1 on phenstatement (genotype_env_id);
create index phenstatement_idx2 on phenstatement (phenotype_id);
create table feature_phenotype (
feature_phenotype_id serial not null,
primary key (feature_phenotype_id),
feature_id int not null,
foreign key (feature_id) references feature (feature_id) on delete cascade,
phenotype_id int not null,
foreign key (phenotype_id) references phenotype (phenotype_id) on delete cascade,
unique(feature_id,phenotype_id)
);
GRANT ALL on feature_pheno_feature_pheno_seq to PUBLIC;
GRANT ALL on feature_phenotype to PUBLIC;
create index feature_phenotype_idx1 on feature_phenotype (feature_id);
GRANT ALL on feature_phenotype to PUBLIC;
create index feature_phenotype_idx2 on feature_phenotype (phenotype_id);
GRANT ALL on feature_phenotype to PUBLIC;
create table phendesc (
phendesc_id serial not null,
primary key (phendesc_id),
genotype_id int not null;
foreign key (genotype_id) references genotype (genotype_id) on delete cascade,
environment_id int;
foreign key (environment_id) references environment ( environment_id) on delete cascade,
description text not null,
pub_id int not null,
foreign key (pub_id) references pub (pub_id) on delete cascade,
unique(genotype_id, environment_id, pub_id)
);
create index phendesc_idx1 on phendesc (genotype_id);
create index phendesc_idx2 on phendesc (environment_id);
create index phendesc_idx3 on phendesc (pub_id);
create table phenotype_comparison (
phenotype_comparison_id serial not null,
primary key (phenotype_comparison_id),
genotype1_id int not null,
foreign key (genotype1_id) references genotype (genotype_id) on delete cascade,
environment1_id int,
foreign key (environment1_id) references environment (environment_id) on delete cascade,
genotype2_id int not null,
foreign key (genotype2_id) references genotype (genotype_id) on delete cascade,
environment2_id int,
foreign key (environment2_id) references environment (environment_id) on delete cascade,
phenotype1_id int not null,
foreign key (phenotype1_id) references phenotype (phenotype_id) on delete cascade,
phenotype2_id int,
foreign key (phenotype2_id) references phenotype (phenotype_id) on delete cascade,
type_id int not null,
foreign key (type_id) references cvterm (cvterm_id) on delete cascade,
pub_id int not null,
foreign key (pub_id) references pub (pub_id) on delete cascade,
unique(genotype1_id, environment1_id, genotype2_id, environment2_id, phenotype1_id, phenotype2_id,
type_id,
pub_id)
)
|
|
From: Linda S. <Lin...@cg...> - 2005-02-23 15:28:32
|
Hello, I have a question about the phenstatement table of the genetic module. Since the phenstatement is a "single atomic phenotypic observation", I=20= do not understand why the unique constraint is on "gcontext_id,=20 dbxref_id, observable_id"! First, what does dbxref refer to? The main problem, is the unique constraint on the observable_id. For example, I need to model a paramecium mutant that has the following=20= phenotypes (hence two phenstatements for one gcontext) : Observable =3D cell, Attribute =3D size, Value =3D SmallSize Observable =3D cell, Attribute =3D shape, Value =3D Abnormal This apparently would violate the unique constraint on the=20 observable_id. If attribute_id were also included in the unique constraint, then it=20 would be OK. Regards, Linda Sperling Linda Sperling Centre de G=E9n=E9tique Mol=E9culaire CNRS Avenue de la Terrasse 91198 Gif-sur-Yvette CEDEX FRANCE spe...@cg... +33 (0)1 69 82 32 09 (telephone) +33 (0)1 69 82 31 50 (fax) http://paramecium.cgm.cnrs-gif.fr SAUVONS LA RECHERCHE http://recherche-en-danger.apinc.org= |
|
From: Chris M. <cj...@fr...> - 2005-02-22 21:17:55
|
On Tue, 22 Feb 2005, Scott Cain wrote: > On Tue, 2005-02-22 at 11:00 -0800, Chris Mungall wrote: > > > > On Tue, 22 Feb 2005, Scott Cain wrote: > > > > > On Tue, 2005-02-22 at 08:13 -0800, Chris Mungall wrote: > > > > > > > > > > > > A cv.name should never be "Ad hoc" because there will be collisions > > > > > > between cvs > > > > > > > > > > Fair enough. Would you prefer 'local'? > > > > > > > > Sorry, I was being cryptic. What I meant was, even these "ad hoc" > > > > ontologies must have some kind of name that communicates the nature of the > > > > cvterms within them. It seemed like you were planning on lumping all > > > > ad-hoc ontologies together, which has a high likelihood of producing > > > > collisions on cvterm unique keys. > > > > > > > > It sounds like "Ad hoc:synonym" is actually from a cv of property types > > > > that can be attached using featureprop et al. This is very definitely not > > > > an ad-hoc ontology, it's crucial that featureprop types have their own cv > > > > and are defined > > > > > > Actually, I only have a few things in 'local'. They are things that are > > > fundamental to making chado work (or the gff loader work), like score > > > and synonym. There are several other ad hoc ontologies that have more > > > > There is nothing gff-specific about the concept of 'score'. There needs to > > be an ontology of score types (possibly as part of a larger statistical > > term cv). The score column in a gff maps to the most generic score term in > > this ontology, since gff doesn't let you distinguish between score types. > > > > Clearly this ontology won't materialize overnight, so I suggest for now > > you create an on-the-fly cv called "score" or "program_output", with one > > term in it called "score". If anyone is up for it we can create a mini-obo > > file with a few more specific terms in it. > > Well, the real problem with score is that it ought to map to somewhere > in analysisfeature, but we can't do that because we don't know what kind > of score it is. If we scrapped the current analysisfeature and replaced > it with a table that has a score and score_type column (for example), > this would work better, because then we could do as you suggest and just > give GFF scores the most generic form of score, which would allow users > to go back and alter the type if they know (or provide it on the command > line when loading a given file). The problem with this is that, in > order for it to stay normalized, the score would have to go in a new > table (analysisfeature_prop?), since you could have multiple types of > scores for a given analysisfeature (which is presumably why there are > multiple columns in analysysfeature to begin with). The gffscore corresponds to analysisfeature.rawscore - as the docs state, this is the "native" scoring system used by a program. There are no semantics imposed on it. If you happen to know that the score type is a bitscore, then you can populate this column too. the idea is that the analysisfeature score types would map to suitable upper-level terms in some yet-to-be-defined program/program output ontology the analysisfeature.score columns are something of an exception to the whole chado design philosophy. To be consistent, we really should have just used featureprops. However, the ability to have the scores available as floats to sql queries is incredibly useful. one could perhaps argue the same for other kinds of featureprop another way we could have done this is to have had different featureprop tables at the physical layer: featureprop_float, featureprop_int, featureprop_text, etc. The logical layer would provide one relation: featureprop - this would be a view or materialized view over the underlying featureprop_<type> tables. Applications that required sql ordering over numeric featureprop values and such could cut beneath the main presentation layer and go to the optimisation layer. In fact there is nothing to stop anyone instantiating their own chado in this way (they'd have to write trigger code and views or view-materializers of course). It could be implemented in the converse direction, with materialized views such as featureprop_float over featureprop. this was the way chado was always meant to be: a simple, generic top layer, with other presentation layers available given that the dbms code to do all this is still at the TODO/alpha stage the analysisfeature.score columns are a reasonable if ugly compromise between genericness and utility > > > descriptive names like 'property type', and 'Statistical terms'. I > > > think this arrangement makes as much sense as FlyBase chado's use of the > > > synonym type ontology with exactly one term in it: synonym. > > > > GO has typed synonyms (exact, narrower_than, etc). All inherit from a > > generic "synonym" which corresponds to the cvterm that is in the current > > fb chado. I can generate an obo file of synonym types for you if you like. > > > > cvterm is really not intended as a dumping ground for homeless strings. > > The whole point of using cvterm for things such as feature types as well > > as things like GO is to allow the chado model to be extensible, > > interoperable, well-defined etc. > > > > > Darn it, as soon as I wrote that last sentence, I was reminded of how > > > it's not true: Lincoln was complaining to me a few months ago about > > > WormBase's lack of typed synonyms. He said it would be nice to have > > > 'GenBank synonym', 'Swissprot synonym', etc. I happily pointed out that > > > chado could easily do that. > > > > I'm not sure if I see these as synonym types. We'd want a way to use the > > db table here. Can you give an example of a GenBank synonym? Things such > > as genbank qualifiers and genbank feature types would go in a genbank cv. > > Well, I'm not a WormBase person, but I am reasonably sure what he was > referring to was the fact the they need to tie accession numbers from > different databases to a given feature in WormBase. I think you are > probably right about how to do it in chado, though, since those > accessions should be tied to a feature via feature_dbxref. yep > > I'm convinced all these so-called ad-hoc CVs can be given homes that will > > later mature into full-fledged well-defined stable CVs > > That is no doubt true--and my putting them in 'local' for the time being > doesn't really cause any long term problems (like by early choices > getting calcified in place--it seems to me that these terms can be > fairly fluid for some time to come without much harm). ok fair enough - but calcification can happen quicker than you'd imagine. if i want to write code that generates a feature report including certain featureprops, then i'd have to hardcode either the cv.name,cvterm.name or the dbxref - before you know it you can't modify the cvterms without fearing you'll break code... i see the featureprop cvterms et al as the second schema layer. it's more flexible than the main relational layer, but even so, you're changing the schema if you change the featureprop cvterms > > > Scott > > > > > > > > > |
|
From: Scott C. <ca...@cs...> - 2005-02-22 19:40:42
|
On Tue, 2005-02-22 at 11:00 -0800, Chris Mungall wrote: > > On Tue, 22 Feb 2005, Scott Cain wrote: > > > On Tue, 2005-02-22 at 08:13 -0800, Chris Mungall wrote: > > > > > > > > > > A cv.name should never be "Ad hoc" because there will be collisions > > > > > between cvs > > > > > > > > Fair enough. Would you prefer 'local'? > > > > > > Sorry, I was being cryptic. What I meant was, even these "ad hoc" > > > ontologies must have some kind of name that communicates the nature of the > > > cvterms within them. It seemed like you were planning on lumping all > > > ad-hoc ontologies together, which has a high likelihood of producing > > > collisions on cvterm unique keys. > > > > > > It sounds like "Ad hoc:synonym" is actually from a cv of property types > > > that can be attached using featureprop et al. This is very definitely not > > > an ad-hoc ontology, it's crucial that featureprop types have their own cv > > > and are defined > > > > Actually, I only have a few things in 'local'. They are things that are > > fundamental to making chado work (or the gff loader work), like score > > and synonym. There are several other ad hoc ontologies that have more > > There is nothing gff-specific about the concept of 'score'. There needs to > be an ontology of score types (possibly as part of a larger statistical > term cv). The score column in a gff maps to the most generic score term in > this ontology, since gff doesn't let you distinguish between score types. > > Clearly this ontology won't materialize overnight, so I suggest for now > you create an on-the-fly cv called "score" or "program_output", with one > term in it called "score". If anyone is up for it we can create a mini-obo > file with a few more specific terms in it. Well, the real problem with score is that it ought to map to somewhere in analysisfeature, but we can't do that because we don't know what kind of score it is. If we scrapped the current analysisfeature and replaced it with a table that has a score and score_type column (for example), this would work better, because then we could do as you suggest and just give GFF scores the most generic form of score, which would allow users to go back and alter the type if they know (or provide it on the command line when loading a given file). The problem with this is that, in order for it to stay normalized, the score would have to go in a new table (analysisfeature_prop?), since you could have multiple types of scores for a given analysisfeature (which is presumably why there are multiple columns in analysysfeature to begin with). > > > descriptive names like 'property type', and 'Statistical terms'. I > > think this arrangement makes as much sense as FlyBase chado's use of the > > synonym type ontology with exactly one term in it: synonym. > > GO has typed synonyms (exact, narrower_than, etc). All inherit from a > generic "synonym" which corresponds to the cvterm that is in the current > fb chado. I can generate an obo file of synonym types for you if you like. > > cvterm is really not intended as a dumping ground for homeless strings. > The whole point of using cvterm for things such as feature types as well > as things like GO is to allow the chado model to be extensible, > interoperable, well-defined etc. > > > Darn it, as soon as I wrote that last sentence, I was reminded of how > > it's not true: Lincoln was complaining to me a few months ago about > > WormBase's lack of typed synonyms. He said it would be nice to have > > 'GenBank synonym', 'Swissprot synonym', etc. I happily pointed out that > > chado could easily do that. > > I'm not sure if I see these as synonym types. We'd want a way to use the > db table here. Can you give an example of a GenBank synonym? Things such > as genbank qualifiers and genbank feature types would go in a genbank cv. Well, I'm not a WormBase person, but I am reasonably sure what he was referring to was the fact the they need to tie accession numbers from different databases to a given feature in WormBase. I think you are probably right about how to do it in chado, though, since those accessions should be tied to a feature via feature_dbxref. > > I'm convinced all these so-called ad-hoc CVs can be given homes that will > later mature into full-fledged well-defined stable CVs That is no doubt true--and my putting them in 'local' for the time being doesn't really cause any long term problems (like by early choices getting calcified in place--it seems to me that these terms can be fairly fluid for some time to come without much harm). > > > Scott > > > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. ca...@cs... GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory |
|
From: Chris M. <cj...@fr...> - 2005-02-22 19:00:20
|
On Tue, 22 Feb 2005, Scott Cain wrote: > On Tue, 2005-02-22 at 08:13 -0800, Chris Mungall wrote: > > > > > > > > A cv.name should never be "Ad hoc" because there will be collisions > > > > between cvs > > > > > > Fair enough. Would you prefer 'local'? > > > > Sorry, I was being cryptic. What I meant was, even these "ad hoc" > > ontologies must have some kind of name that communicates the nature of the > > cvterms within them. It seemed like you were planning on lumping all > > ad-hoc ontologies together, which has a high likelihood of producing > > collisions on cvterm unique keys. > > > > It sounds like "Ad hoc:synonym" is actually from a cv of property types > > that can be attached using featureprop et al. This is very definitely not > > an ad-hoc ontology, it's crucial that featureprop types have their own cv > > and are defined > > Actually, I only have a few things in 'local'. They are things that are > fundamental to making chado work (or the gff loader work), like score > and synonym. There are several other ad hoc ontologies that have more There is nothing gff-specific about the concept of 'score'. There needs to be an ontology of score types (possibly as part of a larger statistical term cv). The score column in a gff maps to the most generic score term in this ontology, since gff doesn't let you distinguish between score types. Clearly this ontology won't materialize overnight, so I suggest for now you create an on-the-fly cv called "score" or "program_output", with one term in it called "score". If anyone is up for it we can create a mini-obo file with a few more specific terms in it. > descriptive names like 'property type', and 'Statistical terms'. I > think this arrangement makes as much sense as FlyBase chado's use of the > synonym type ontology with exactly one term in it: synonym. GO has typed synonyms (exact, narrower_than, etc). All inherit from a generic "synonym" which corresponds to the cvterm that is in the current fb chado. I can generate an obo file of synonym types for you if you like. cvterm is really not intended as a dumping ground for homeless strings. The whole point of using cvterm for things such as feature types as well as things like GO is to allow the chado model to be extensible, interoperable, well-defined etc. > Darn it, as soon as I wrote that last sentence, I was reminded of how > it's not true: Lincoln was complaining to me a few months ago about > WormBase's lack of typed synonyms. He said it would be nice to have > 'GenBank synonym', 'Swissprot synonym', etc. I happily pointed out that > chado could easily do that. I'm not sure if I see these as synonym types. We'd want a way to use the db table here. Can you give an example of a GenBank synonym? Things such as genbank qualifiers and genbank feature types would go in a genbank cv. I'm convinced all these so-called ad-hoc CVs can be given homes that will later mature into full-fledged well-defined stable CVs > Scott > > |
|
From: Scott C. <ca...@cs...> - 2005-02-22 18:01:13
|
On Tue, 2005-02-22 at 08:13 -0800, Chris Mungall wrote: > > > > > > A cv.name should never be "Ad hoc" because there will be collisions > > > between cvs > > > > Fair enough. Would you prefer 'local'? > > Sorry, I was being cryptic. What I meant was, even these "ad hoc" > ontologies must have some kind of name that communicates the nature of the > cvterms within them. It seemed like you were planning on lumping all > ad-hoc ontologies together, which has a high likelihood of producing > collisions on cvterm unique keys. > > It sounds like "Ad hoc:synonym" is actually from a cv of property types > that can be attached using featureprop et al. This is very definitely not > an ad-hoc ontology, it's crucial that featureprop types have their own cv > and are defined Actually, I only have a few things in 'local'. They are things that are fundamental to making chado work (or the gff loader work), like score and synonym. There are several other ad hoc ontologies that have more descriptive names like 'property type', and 'Statistical terms'. I think this arrangement makes as much sense as FlyBase chado's use of the synonym type ontology with exactly one term in it: synonym. Darn it, as soon as I wrote that last sentence, I was reminded of how it's not true: Lincoln was complaining to me a few months ago about WormBase's lack of typed synonyms. He said it would be nice to have 'GenBank synonym', 'Swissprot synonym', etc. I happily pointed out that chado could easily do that. Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. ca...@cs... GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory |
|
From: Chris M. <cj...@fr...> - 2005-02-22 16:13:34
|
On Mon, 21 Feb 2005, Scott Cain wrote: > On Mon, 2005-02-21 at 19:49 -0800, Chris Mungall wrote: > > > > On Mon, 21 Feb 2005, Scott Cain wrote: > > > > > On Mon, 2005-02-21 at 19:18 -0800, Chris Mungall wrote: > > > > the dbxref must be persistent and consistent across database > > > > instances > > > > > > Why? By there very nature, they are going to be the sort of thing that > > > are locally used and not global. If they were more generally used, they > > > should be on OBO. > > > > Not necessarily. If you look at most of the non-obo ontologies out there > > typically available in OWL format, most conflate the name and ID > > > > > > what about making dbxref = cv.name + cvterm.name? > > > > > > Seems ugly. So an accession of "Ad Hoc Ontology:synonym" is what you > > > are suggesting? > > > > Eh? > > > > A cv.name should never be "Ad hoc" because there will be collisions > > between cvs > > Fair enough. Would you prefer 'local'? Sorry, I was being cryptic. What I meant was, even these "ad hoc" ontologies must have some kind of name that communicates the nature of the cvterms within them. It seemed like you were planning on lumping all ad-hoc ontologies together, which has a high likelihood of producing collisions on cvterm unique keys. It sounds like "Ad hoc:synonym" is actually from a cv of property types that can be attached using featureprop et al. This is very definitely not an ad-hoc ontology, it's crucial that featureprop types have their own cv and are defined > > > > > > > > On Mon, 21 Feb 2005, Scott Cain wrote: > > > > > > > > > Hi Chris, > > > > > > > > > > I am dreading writing this email a little bit, given the discussion that > > > > > occurred a few weeks ago involving dbxrefs and cvterms. Anyway, I > > > > > gather from the comments you added to cvterm that you really did intend > > > > > to make a unique index on dbxref_id in cvterm, and I even understand why > > > > > you did it given that much of the time, the cvs are coming from a real > > > > > ontology and they have accessions. Of course, some of the time, there > > > > > will be "ad hoc" cvs that won't have accessions. The solution I am > > > > > suggesting is the creation of a db sequence and items that don't belong > > > > > to a formal ontology will get the next available value from the > > > > > sequence. Does that sound OK to you? > > > > > > > > > > Thanks, > > > > > Scott > > > > > > > > > > > > > > > > > > > > |
|
From: Scott C. <ca...@cs...> - 2005-02-22 03:54:23
|
On Mon, 2005-02-21 at 19:49 -0800, Chris Mungall wrote: > > On Mon, 21 Feb 2005, Scott Cain wrote: > > > On Mon, 2005-02-21 at 19:18 -0800, Chris Mungall wrote: > > > the dbxref must be persistent and consistent across database > > > instances > > > > Why? By there very nature, they are going to be the sort of thing that > > are locally used and not global. If they were more generally used, they > > should be on OBO. > > Not necessarily. If you look at most of the non-obo ontologies out there > typically available in OWL format, most conflate the name and ID > > > > what about making dbxref = cv.name + cvterm.name? > > > > Seems ugly. So an accession of "Ad Hoc Ontology:synonym" is what you > > are suggesting? > > Eh? > > A cv.name should never be "Ad hoc" because there will be collisions > between cvs Fair enough. Would you prefer 'local'? > > > > > > > On Mon, 21 Feb 2005, Scott Cain wrote: > > > > > > > Hi Chris, > > > > > > > > I am dreading writing this email a little bit, given the discussion that > > > > occurred a few weeks ago involving dbxrefs and cvterms. Anyway, I > > > > gather from the comments you added to cvterm that you really did intend > > > > to make a unique index on dbxref_id in cvterm, and I even understand why > > > > you did it given that much of the time, the cvs are coming from a real > > > > ontology and they have accessions. Of course, some of the time, there > > > > will be "ad hoc" cvs that won't have accessions. The solution I am > > > > suggesting is the creation of a db sequence and items that don't belong > > > > to a formal ontology will get the next available value from the > > > > sequence. Does that sound OK to you? > > > > > > > > Thanks, > > > > Scott > > > > > > > > > > > > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. ca...@cs... GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory |
|
From: Chris M. <cj...@fr...> - 2005-02-22 03:49:33
|
On Mon, 21 Feb 2005, Scott Cain wrote: > On Mon, 2005-02-21 at 19:18 -0800, Chris Mungall wrote: > > the dbxref must be persistent and consistent across database > > instances > > Why? By there very nature, they are going to be the sort of thing that > are locally used and not global. If they were more generally used, they > should be on OBO. Not necessarily. If you look at most of the non-obo ontologies out there typically available in OWL format, most conflate the name and ID > > what about making dbxref = cv.name + cvterm.name? > > Seems ugly. So an accession of "Ad Hoc Ontology:synonym" is what you > are suggesting? Eh? A cv.name should never be "Ad hoc" because there will be collisions between cvs > > > > On Mon, 21 Feb 2005, Scott Cain wrote: > > > > > Hi Chris, > > > > > > I am dreading writing this email a little bit, given the discussion that > > > occurred a few weeks ago involving dbxrefs and cvterms. Anyway, I > > > gather from the comments you added to cvterm that you really did intend > > > to make a unique index on dbxref_id in cvterm, and I even understand why > > > you did it given that much of the time, the cvs are coming from a real > > > ontology and they have accessions. Of course, some of the time, there > > > will be "ad hoc" cvs that won't have accessions. The solution I am > > > suggesting is the creation of a db sequence and items that don't belong > > > to a formal ontology will get the next available value from the > > > sequence. Does that sound OK to you? > > > > > > Thanks, > > > Scott > > > > > > > > > |
|
From: Scott C. <ca...@cs...> - 2005-02-22 03:27:46
|
On Mon, 2005-02-21 at 19:18 -0800, Chris Mungall wrote: > the dbxref must be persistent and consistent across database > instances Why? By there very nature, they are going to be the sort of thing that are locally used and not global. If they were more generally used, they should be on OBO. > > what about making dbxref = cv.name + cvterm.name? Seems ugly. So an accession of "Ad Hoc Ontology:synonym" is what you are suggesting? > > On Mon, 21 Feb 2005, Scott Cain wrote: > > > Hi Chris, > > > > I am dreading writing this email a little bit, given the discussion that > > occurred a few weeks ago involving dbxrefs and cvterms. Anyway, I > > gather from the comments you added to cvterm that you really did intend > > to make a unique index on dbxref_id in cvterm, and I even understand why > > you did it given that much of the time, the cvs are coming from a real > > ontology and they have accessions. Of course, some of the time, there > > will be "ad hoc" cvs that won't have accessions. The solution I am > > suggesting is the creation of a db sequence and items that don't belong > > to a formal ontology will get the next available value from the > > sequence. Does that sound OK to you? > > > > Thanks, > > Scott > > > > > -- ------------------------------------------------------------------------ Scott Cain, Ph. D. ca...@cs... GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory |
|
From: Chris M. <cj...@fr...> - 2005-02-22 03:18:30
|
the dbxref must be persistent and consistent across database instances what about making dbxref = cv.name + cvterm.name? On Mon, 21 Feb 2005, Scott Cain wrote: > Hi Chris, > > I am dreading writing this email a little bit, given the discussion that > occurred a few weeks ago involving dbxrefs and cvterms. Anyway, I > gather from the comments you added to cvterm that you really did intend > to make a unique index on dbxref_id in cvterm, and I even understand why > you did it given that much of the time, the cvs are coming from a real > ontology and they have accessions. Of course, some of the time, there > will be "ad hoc" cvs that won't have accessions. The solution I am > suggesting is the creation of a db sequence and items that don't belong > to a formal ontology will get the next available value from the > sequence. Does that sound OK to you? > > Thanks, > Scott > > |
|
From: Scott C. <ca...@cs...> - 2005-02-22 03:15:20
|
Hi Chris, I am dreading writing this email a little bit, given the discussion that occurred a few weeks ago involving dbxrefs and cvterms. Anyway, I gather from the comments you added to cvterm that you really did intend to make a unique index on dbxref_id in cvterm, and I even understand why you did it given that much of the time, the cvs are coming from a real ontology and they have accessions. Of course, some of the time, there will be "ad hoc" cvs that won't have accessions. The solution I am suggesting is the creation of a db sequence and items that don't belong to a formal ontology will get the next available value from the sequence. Does that sound OK to you? Thanks, Scott -- ------------------------------------------------------------------------ Scott Cain, Ph. D. ca...@cs... GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory |
|
From: Pinglei Z. <zh...@mo...> - 2005-02-15 22:25:47
|
>> >> > Now to address you questions (mostly with more question): first, are you >> > using the current XML-XORT (ie, the separate repository, not the one >> > under the schema repository)? Next, when you installed gmod, did you set >> > up the environment variables as suggested in the INSTALL doc? If so, >> > you should get a prompt to use Bio::GMOD::Config when you run XORT's >> > Makefile.PL. Answering yes to that should make the XORT installer use >> > complete.sql which was installed when you installed gmod (it probably >> > went in /usr/local/gmod/src/modules). Finally note that I just fixed a >> > bug in the XORT installer, so if you are using an anonymous checkout, it >> > may take a day or so for you to get the bug fix (I can send you the >> > fixed file separately if you want). >> >> I have to admit I'm still a bit confused by the installation process - >> perhaps XORT should have it's own INSTALL document? > >Whoops! Yes, there should be one. Anyone want to write it? there is one XML-XORT/README, and another one in more details XML-XORT/doc/readme_xort, is this not enough for INSTALL ? I do miss some test files for make test. >> Maybe my confusion >> comes from the fact that I'm currently looking at just XORT & Chado and >> not trying to install GMOD as a whole. Is this not to be recommended? > >Right; so if you start with installing gmod/chado from it's Makefile.PL, >one of the things that gets installed is Bio::GMOD::Config, which reads >configuration information from files in /usr/local/gmod/conf. The XORT >Makefile.PL checks for the existence of that module and prompts the user >to use that information. >> >> Is there any reason not to cvs rm the deprecated schema/XMLTools/XORT/ >> directory, since that is now superseded by gmod/XML-XORT? > >No, I was reluctant to do it because I didn't want to be responsible for >losing the history of the files. If Pinglei doesn't mind, I can ask the >SF admins to remove that directory. >> >> Note that the xort test suite seems to fail at the moment, due to the >> absence of a XML::XORT module > >Right again; nobody has taken the time to write tests. >> >> Scott - is there any plan to switch to using the SQL::T templates for >> generating the xort conf files? On the one hand, the existing >> xort_ddl_properties_creator.pl is a little hacky and is perhaps not robust >> to changes in the style of DDL. On the other hand, this script seems happy >> with things like function definitions in the complete.sql file (which I >> believe SQL::T will barf at). Furthermore, I'm a little worried by the >> 'CPAN bloat' in SQL::T - it used to be a fairly lightweight module, but >> the list of dependencies seems to have grown rather a lot. I was >> previously enthusiastic about a switch to using SQL::T but I'm not so sure >> now. > >I sympathize with your concerns about CPAN bloat. Bundle::GMOD has >quite a few modules in it. My philosophy lately has been, if it >installs fine with the cpan shell, who cares how many modules there are? Not really, I am working the doc "using chado to store annotation genome" for CPIB, pretend to be biologist and should be able to set up the system with limited knowledge. It turn out to be VERY annoying, seems need endless more modues in order to install one modules. FYI, I am testing it on window with cygwin. core perl come with cygwin have very basical modules. It need lots of additional ones for bioperl, XORT, not mention whole GMOD package. >That said, SQL::T is currently an optional module for gmod; users only >need it if they want to modify the schema. If they don't modify the >schema, the installer uses default files that are created when I create >a distribution (well, anytime the schema tables get changed really). I >wouldn't mind seeing pregenerated files for XORT living in the chado cvs >that get used as the default files, while if the user changes the >schema, they have to let XORT regenerate the files. > >As for migrating to SQL::T for XORT, it doesn't matter much to me. I >wrote a proof of concept template for generating ddl_properties using >SQL::T last fall, but it didn't quite work. I'm sure if we wanted to go >down that path we could fix it. The question I have is this: how >fragile is xort_ddl_properties_creator.pl? Should we invest the time >migrating now to save headaches later? No objection from me to replace xort_ddl_properties_creator.pl if have better way. Actually xort_ddl_properties_creator.pl work fine if all create table statements following the same style(e.g how to specify foreign keys). I have to admit only test on postgres ddl, not other DBMS ddl. >> I think you and Pinglei have done an excellent job documenting all this >> code & making it easy to use. I think the XORT stuff is 95% there. > >Thanks. >> >> I have one request as far as XORT is concerned. Would it not be possible >> to allow runtime config rather than install-time config. All this setting >> of chado variables at Makefile.PL stage doesn't seem quite right. I'm not >> sure why things have to be set in Constants.pm. > >I am inclined to agree, but I don't have the time at the moment to make >the changes. >> >> Surely it would be fairly easy to make Constants.pm lookup a config file >> at runtime? This could be a standard ini file like this: >> >> [database] >> name=chado >> conf_dir=~/chado/conf >> dbport=5432 >> >> [database] >> name=test_chado >> conf_dir=~/test_chado/conf >> >> or some xml equivalent? It is totally doable. pinglei |
|
From: Pinglei Z. <zh...@mo...> - 2005-02-15 22:08:00
|
>I second this from Chris about run-time configuration with XORT. I've done
>that/had to do that to get it working for flybase server uses.
Originally set as runtime config, some how see objections and changed to the
ways it is now. Maybe should adopt both ways ?
Pinglei
>>I have one request as far as XORT is concerned. Would it not be possible
>>to allow runtime config rather than install-time config. All this setting
>>of chado variables at Makefile.PL stage doesn't seem quite right. I'm not
>>sure why things have to be set in Constants.pm.
>
>Basically this one needs some config-file lookup code.
>-- Don
>
>package XML::XORT::Util::GeneralUtil::Constants;
>use strict;
>use Cwd;
>use File::Spec;
>
>## dgg here to make this path-independent
>use File::Spec::Functions qw/ catdir catfile /;
>use FindBin qw( $Bin );
>use Cwd qw(abs_path);
>
>use vars qw/$myroot $CONF_DIR %CONSTANT/;
>
>## stuff from installer I don't want
># my %CONSTANT=(DBPORT=>'7302',
># SQLFILE=>'/Users/gilbertd/mailf/04oct/docs/FB_XORT/conf/chado.ddl',
># TMP=>'/bio/biodb/common/perl/XORT2/tmp',
># DBHOST=>'localhost',
># CONF=>'/bio/biodb/common/perl/XORT2/conf',
># DBNAME=>'dmel_chado',
># DBUSER=>'flybase',
># );
>
>sub constants
>{
> $CONF_DIR= confdir();
> # return %CONSTANT if scalar(%CONSTANT);
> return (
> TMP => $ENV{XORT_TMP} || $ENV{TMP} || "/tmp",
> CONF => $CONF_DIR,
> DBNAME => $ENV{CHADO_DB_NAME}|| $ENV{DBNAME}|| 'chado',
> DBPORT => $ENV{CHADO_DB_PORT}|| $ENV{DBPORT}|| 7302,
> DBUSER => $ENV{CHADO_DB_USERNAME}|| $ENV{DBUSER}|| '',
> DBPASSWORD => $ENV{CHADO_DB_PASSWORD}|| $ENV{DBPASSWORD}|| '',
> );
>}
>
>sub confdir
>{
> return $CONF_DIR if($CONF_DIR && -d $CONF_DIR);
> if (defined $ENV{'GMOD_ROOT'} && -d $ENV{'GMOD_ROOT'}) { $myroot=
$ENV{'GMOD_ROOT'}; }
> else { $myroot= abs_path($Bin); }
> $myroot= _updir( $myroot, ["conf"] ); # in case we are not in bin/ folder
> return $ENV{XORT_CONF} if $ENV{XORT_CONF};
> foreach my $d ($myroot, $ENV{XORT}, $ENV{CodeBase}) {
> return "$d/conf" if ($d && -d $d && -d "$d/conf");
> }
> return "./conf";
>}
>
>
>sub _updir
>{
> my ($atdir,$todirs)= @_;
> my $dir= $atdir;
> my $ok= 0;
> my $cod;
> foreach my $td (@$todirs) { $cod= catdir($dir, $td); $ok= (-d $cod); last if
$ok; }
> while (!$ok && length($dir)>2) {
> $dir= "$dir/..";
> $dir=`cd "$dir" && pwd`; chomp($dir);
> foreach my $td (@$todirs) { $cod= catdir($dir, $td); $ok= (-d $cod); last
if $ok; }
> }
> return ($ok) ? $dir : $atdir;
>}
>
>
>BEGIN {
> $CONF_DIR= confdir();
> %CONSTANT= constants();
>}
>#------------------------------------
>
>sub new (){
> my $type=shift;
> my $self={};
> bless $self, $type;
> return $self;
>}
>
>sub get_constant (){
> my $self=shift;
> my $key=shift;
> return $CONSTANT{$key} || $ENV{$key};
>}
>
>sub add_constant(){
> my $self=shift;
> my $key=shift;
> my $value=shift;
> $CONSTANT{$key}=$value;;
>}
>
>1;
>
>
>
>-------------------------------------------------------
>SF email is sponsored by - The IT Product Guide
>Read honest & candid reviews on hundreds of IT Products from real users.
>Discover which products truly live up to the hype. Start reading now.
>http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
>_______________________________________________
>Gmod-schema mailing list
>Gmo...@li...
>https://lists.sourceforge.net/lists/listinfo/gmod-schema
|
|
From: Chris M. <cj...@fr...> - 2005-02-15 22:04:43
|
On Tue, 15 Feb 2005, Scott Cain wrote: > On Tue, 2005-02-15 at 13:09 -0800, Chris Mungall wrote: > > > > On Tue, 15 Feb 2005, Scott Cain wrote: > > > > > Hi Linda, > > > > > > I am cc'ing this to the schema mailing list because it is an importan= t > > > question. > > > > > > I am inclined to say that the whole schema/chado/dat/ directory, whic= h > > > contains chado.* should be removed as it is quite out of date; any > > > objections? (It is currently ignored when distributions are created > > > because it isn't in the MANIFEST file). > > > > No objections from me > > > > > Also, I would advocate removing the Flybase specific files from XML- > > > XORT/conf/ , including chado.ddl, chado.dtd, ddl.properties and the > > > trigger files. They could be moved to another directory in the > > > repository like contrib/ . > > > > In what sense are these FlyBase specific files (other than perhaps the > > trigger code)? > > They are FlyBase specific because they are for the FlyBase flavor of > chado, which could certainly confuse non-flybase folks. but isn't chado.ddl the same for everyone? ah, perhaps not, as some people may include/exclude certain modules, i see. > > Presumably most of these are all files that are auto-generated by the X= ORT > > Makefile.PL, so there is no need to keep them in cvs > > Probably so--so they should be removed. > > > > The trigger code may benefit other non-FB users. It seems that the > > appropriate place to keep this is in > > chado/modules/<module>/flybase/flybase-<module>-triggers.sql > > There are more generic triggers that I've been working on that reside in > schema/chado/modules/sequence/apollo-bridge/ . I don't much care where > the flybase triggers go, but your suggestion seems fine. > > > > Now to address you questions (mostly with more question): first, are = you > > > using the current XML-XORT (ie, the separate repository, not the one > > > under the schema repository)? Next, when you installed gmod, did you = set > > > up the environment variables as suggested in the INSTALL doc? If so, > > > you should get a prompt to use Bio::GMOD::Config when you run XORT's > > > Makefile.PL. Answering yes to that should make the XORT installer us= e > > > complete.sql which was installed when you installed gmod (it probably > > > went in /usr/local/gmod/src/modules). Finally note that I just fixed= a > > > bug in the XORT installer, so if you are using an anonymous checkout,= it > > > may take a day or so for you to get the bug fix (I can send you the > > > fixed file separately if you want). > > > > I have to admit I'm still a bit confused by the installation process - > > perhaps XORT should have it's own INSTALL document? > > Whoops! Yes, there should be one. Anyone want to write it? > > > Maybe my confusion > > comes from the fact that I'm currently looking at just XORT & Chado and > > not trying to install GMOD as a whole. Is this not to be recommended? > > Right; so if you start with installing gmod/chado from it's Makefile.PL, > one of the things that gets installed is Bio::GMOD::Config, which reads > configuration information from files in /usr/local/gmod/conf. The XORT > Makefile.PL checks for the existence of that module and prompts the user > to use that information. OK, I see. I still feel that this a bit back-to-front, as I think of xort as being a dependency of chado-xml rather than vice versa > > Is there any reason not to cvs rm the deprecated schema/XMLTools/XORT/ > > directory, since that is now superseded by gmod/XML-XORT? > > No, I was reluctant to do it because I didn't want to be responsible for > losing the history of the files. If Pinglei doesn't mind, I can ask the > SF admins to remove that directory. I think it's fine just to cvs rm the files. the directories will hang around annoyingly since there's no way of truly deleting them with cvs, but they will be invisible so long as people cvs update with the prune option > > Note that the xort test suite seems to fail at the moment, due to the > > absence of a XML::XORT module > > Right again; nobody has taken the time to write tests. fair enough - but the 'test' that is there fails... > > Scott - is there any plan to switch to using the SQL::T templates for > > generating the xort conf files? On the one hand, the existing > > xort_ddl_properties_creator.pl is a little hacky and is perhaps not rob= ust > > to changes in the style of DDL. On the other hand, this script seems ha= ppy > > with things like function definitions in the complete.sql file (which I > > believe SQL::T will barf at). Furthermore, I'm a little worried by the > > 'CPAN bloat' in SQL::T - it used to be a fairly lightweight module, but > > the list of dependencies seems to have grown rather a lot. I was > > previously enthusiastic about a switch to using SQL::T but I'm not so s= ure > > now. > > I sympathize with your concerns about CPAN bloat. Bundle::GMOD has > quite a few modules in it. My philosophy lately has been, if it > installs fine with the cpan shell, who cares how many modules there are? > That said, SQL::T is currently an optional module for gmod; users only > need it if they want to modify the schema. If they don't modify the > schema, the installer uses default files that are created when I create > a distribution (well, anytime the schema tables get changed really). I > wouldn't mind seeing pregenerated files for XORT living in the chado cvs > that get used as the default files, while if the user changes the > schema, they have to let XORT regenerate the files. That sounds fine (so long as they're kept in sync of course!) > As for migrating to SQL::T for XORT, it doesn't matter much to me. I > wrote a proof of concept template for generating ddl_properties using > SQL::T last fall, but it didn't quite work. I'm sure if we wanted to go > down that path we could fix it. The question I have is this: how > fragile is xort_ddl_properties_creator.pl? Should we invest the time > migrating now to save headaches later? I don't know... > > I think you and Pinglei have done an excellent job documenting all this > > code & making it easy to use. I think the XORT stuff is 95% there. > > Thanks. > > > > I have one request as far as XORT is concerned. Would it not be possibl= e > > to allow runtime config rather than install-time config. All this setti= ng > > of chado variables at Makefile.PL stage doesn't seem quite right. I'm n= ot > > sure why things have to be set in Constants.pm. > > I am inclined to agree, but I don't have the time at the moment to make > the changes. well it sounds like Don may already have the code to do this... > > Surely it would be fairly easy to make Constants.pm lookup a config fil= e > > at runtime? This could be a standard ini file like this: > > > > [database] > > name=3Dchado > > conf_dir=3D~/chado/conf > > dbport=3D5432 > > > > [database] > > name=3Dtest_chado > > conf_dir=3D~/test_chado/conf > > > > or some xml equivalent? > > > > > Scott > > > > > > > > > > > > On Tue, 2005-02-15 at 10:53 +0100, Linda Sperling wrote: > > > > Hi Scott, > > > > > > > > The problem persists as concerns seeing double features, any insigh= t is > > > > welcome. > > > > > > > > I have a (probably stupid) question (and Olivier is not around toda= y). > > > > Maybe you can answer it. > > > > > > > > I am spending a lot of effort trying to transitiion our stock > > > > collection database into the chado genetic module. I now understand= it, > > > > I understand how it will work for our stocks and for our rnai data,= and > > > > I even have added a submodule (on paper) for the stocks (as suggest= ed > > > > by whoever put comments in the genetic.sql). Great. > > > > > > > > So I decided to poke into the XORT and XML stuff, to figure out how= to > > > > dump our present database into chado xml, so that Olivier can then = work > > > > on how to load it and test it. > > > > > > > > I cannot find any xml or dtd or ddl file that incorporates the curr= ent > > > > (I hope!) genetic module. I can only find things that mention genot= ype > > > > and phenotype and interaction tables which no longer exist in the c= hado > > > > genetic.sql!!!! > > > > > > > > Is this because I am supposed to know how to generate the dtd mysel= f > > > > from the chado.sql ? > > > > > > > > Or is this because pinglei and co. have not yet payed attention to = the > > > > new genetic module in the XML-XORT package? > > > > > > > > > > > > I am crossing my fingers that the chado.sql (which corresponds to = Dave > > > > Emmert's presentation at the April GMOD meeting in Cambridge) **is*= * > > > > the recent version -- because it will work for us! But I am surpris= ed > > > > that the chado.dtd and chado.ddl in CVS under gmod/XML-XORT/conf i= s > > > > not up to date. > > > > > > > > any insight welcome! > > > > > > > > thanks, > > > > Linda > > > > > > > > > > > > > > > > On Feb 14, 2005, at 3:45 PM, Scott Cain wrote: > > > > > > > > > Hi Linda, > > > > > > > > > > I just wanted to let you know I haven't forgotten about this. I = ran > > > > > into a problem last week while doing a test install of chado--mak= e sure > > > > > that Olivier doesn't upgrade to the current version of Graph (0.5= 5); it > > > > > either makes it impossible to load GO or makes it VERY slow (so s= low > > > > > that the two are indistinguishable on my computer). Anyway, I sh= ould > > > > > be > > > > > moving on to gbrowse problems today. > > > > > > > > > > Scott > > > > > > > > > > > > > > > On Mon, 2005-02-07 at 17:56 +0100, Linda Sperling wrote: > > > > >> Yes, he is using the yeast data, as well as Paramecium data, eit= her > > > > >> dataset gives the problem. > > > > >> > > > > >> But if you don't have that problem with the yeast data... how c= an you > > > > >> figure out what is going wrong ?? > > > > >> > > > > >> L > > > > >> > > > > >> > > > > >> On Feb 7, 2005, at 5:40 PM, Scott Cain wrote: > > > > >> > > > > >>> I've got ideas out the wazoo; the trick, as always, is to narro= w it > > > > >>> down > > > > >>> to the right one :-) > > > > >>> > > > > >>> I suspect that there is a poorly constructed query in the adapt= or. > > > > >>> I'm > > > > >>> installing postgres as I write this. I'll install the yeast dat= a and > > > > >>> do > > > > >>> some testing. Is that what you are using now? > > > > >>> > > > > >>> Thanks, > > > > >>> Scott > > > > >>> > > > > >>> > > > > >>> On Mon, 2005-02-07 at 17:34 +0100, Linda Sperling wrote: > > > > >>>> Scott, > > > > >>>> > > > > >>>> Unfortunately, that's not it. > > > > >>>> If you have any other ideas... > > > > >>>> > > > > >>>> Thanks, > > > > >>>> Linda > > > > >>>> On Feb 7, 2005, at 4:44 PM, Scott Cain wrote: > > > > >>>> > > > > >>>>> Hi Linda, > > > > >>>>> > > > > >>>>> I've seen this before, though I don't remember feature_dbxref= being > > > > >>>>> involved. Perhaps this is not the same problem but is relate= d. I > > > > >>>>> don't > > > > >>>>> have postgres installed on my new hard drive yet (I do have t= he > > > > >>>>> source > > > > >>>>> code though, so it's just a matter of time :-). Ask Olivier = to > > > > >>>>> find > > > > >>>>> the > > > > >>>>> following snippet of code in Browser.pm in the feature2label > > > > >>>>> method: > > > > >>>>> > > > > >>>>> # WARNING: if too many features start showing up in tracks, > > > > >>>>> uncomment > > > > >>>>> # the following line and comment the one after that. > > > > >>>>> # @label =3D $self->type2label($basetype,$length) unless= @label; > > > > >>>>> push @label,$self->type2label($basetype,$length); > > > > >>>>> > > > > >>>>> and switch the commenting on the last two lines and see if th= at > > > > >>>>> helps. > > > > >>>>> Browser.pm should be in a path like > > > > >>>>> perl5/site_perl/5.8.5/i386-linux- > > > > >>>>> thread-multi/Bio/Graphics. > > > > >>>>> > > > > >>>>> Thanks, > > > > >>>>> Scott > > > > >>>>> > > > > >>>>> On Mon, 2005-02-07 at 14:53 +0100, Linda Sperling wrote: > > > > >>>>>> Hello Scott, > > > > >>>>>> > > > > >>>>>> =09We are having a strange problem with gbrowse running on c= hado. > > > > >>>>>> > > > > >>>>>> =09Olivier is loading GFF3 format into chado. The database s= eems to > > > > >>>>>> be > > > > >>>>>> correctly loaded. However, any feature with a dbxref in the = 9th > > > > >>>>>> GFF3 > > > > >>>>>> field, comes up double in gbrowse! The feature_dbxref and db= xref > > > > >>>>>> tables > > > > >>>>>> are loaded correctly. Any feature without dbxref shows up o= nce, > > > > >>>>>> as > > > > >>>>>> expected (and desired). > > > > >>>>>> > > > > >>>>>> =09Any ideas? > > > > >>>>>> > > > > >>>>>> Thanks, > > > > >>>>>> Linda > > > > >>>>>> > > > > >>>>>> > > > > >>>>>> > > > > >>>>>> Linda Sperling > > > > >>>>>> Centre de G=C3=A9n=C3=A9tique Mol=C3=A9culaire > > > > >>>>>> CNRS > > > > >>>>>> Avenue de la Terrasse > > > > >>>>>> 91198 Gif-sur-Yvette CEDEX > > > > >>>>>> FRANCE > > > > >>>>>> > > > > >>>>>> spe...@cg... > > > > >>>>>> +33 (0)1 69 82 32 09 =09(telephone) > > > > >>>>>> +33 (0)1 69 82 31 50=09(fax) > > > > >>>>>> http://paramecium.cgm.cnrs-gif.fr > > > > >>>>>> > > > > >>>>>> SAUVONS LA RECHERCHE > > > > >>>>>> http://recherche-en-danger.apinc.org > > > > >>>>>> > > > > >>>>>> > > > > >>>>> -- > > > > >>>>> -------------------------------------------------------------= ------ > > > > >>>>> -- > > > > >>>>> -- > > > > >>>>> - > > > > >>>>> Scott Cain, Ph. D. > > > > >>>>> ca...@cs... > > > > >>>>> GMOD Coordinator (http://www.gmod.org/) > > > > >>>>> 216-392-3087 > > > > >>>>> Cold Spring Harbor Laboratory > > > > >>>> > > > > >>>> > > > > >>>> > > > > >>> -- > > > > >>> ---------------------------------------------------------------= ------ > > > > >>> -- > > > > >>> - > > > > >>> Scott Cain, Ph. D. > > > > >>> ca...@cs... > > > > >>> GMOD Coordinator (http://www.gmod.org/) > > > > >>> 216-392-3087 > > > > >>> Cold Spring Harbor Laboratory > > > > >> > > > > >> > > > > >> > > > > > -- > > > > > -----------------------------------------------------------------= ------ > > > > > - > > > > > Scott Cain, Ph. D. > > > > > ca...@cs... > > > > > GMOD Coordinator (http://www.gmod.org/) > > > > > 216-392-3087 > > > > > Cold Spring Harbor Laboratory > > > > > > > > > Linda Sperling > > > > Centre de G=C3=A9n=C3=A9tique Mol=C3=A9culaire > > > > CNRS > > > > Avenue de la Terrasse > > > > 91198 Gif-sur-Yvette CEDEX > > > > FRANCE > > > > > > > > spe...@cg... > > > > +33 (0)1 69 82 32 09 =09(telephone) > > > > +33 (0)1 69 82 31 50=09(fax) > > > > http://paramecium.cgm.cnrs-gif.fr > > > > > > > > SAUVONS LA RECHERCHE > > > > http://recherche-en-danger.apinc.org > > > > > > > > > > > > > > > > > > > > |