Hwan Won, Chung - 2014-07-25

DNA-based detection needs universal primers and group-specific probe sets.
Highly conserved regions shared by multiple sequences are good candidates for universal primers.
Specific primer sets can also be used for detection. We therefore need to search for
conserved regions or specific (unique) regions to carry out this work.
Finding universal or specific primer regions in multiple sequences is not easy,
and it is laborious and time-consuming. Our survey found no existing software
that is well suited to this purpose. We therefore set out to develop a new tool
that lets biological researchers design universal primers and internal probes
conveniently and accurately. The result is the Matchup program.

The major contributions of this program are as follows.

  • First, we tackled the problem of designing universal primers that can amplify
    multiple products with a single primer pair. All target DNA sequences must be
    multiply aligned in advance. From the aligned sequences, the most conserved
    regions are found and used to design universal primer pairs. Group-specific
    universal primers, or semi-universal primers, can be designed to amplify
    products from only part of the relevant sequences. The procedure of classifying all sequences
    into groups, finding semi-universal primers, and checking their uniqueness was
    implemented. This process is named the UPMA (Universal Primer by
    Multiple Alignment) algorithm.

  • Second, to overcome the drawbacks of multiple alignment, a new algorithm that
    finds universal primers without multiple alignment was developed. This
    method is based on a suffix tree and multiple common substrings.
    From the generalized suffix tree of all target DNA sequences, every multiple
    common substring can be produced together with the number of sequences it matches. These
    common substrings can be extended to the left or right in search of regions of
    relatively low degeneracy. Any low-degeneracy regions that are found
    can then be checked as universal or semi-universal primers.

  • Third, group-specific or sequence-specific primer pairs can be designed from multiple sequences.
    Their specificity is checked against each sequence, and the distinctness of their product sizes
    can be ensured by manual checks or by genetic-algorithm-based optimization. To minimize the number
    of primer pairs needed for PCR-based separation, specific primers can be searched with
    the forward or reverse primer fixed. If one primer is fixed, the pairing primers
    are searched subject to the product size constraints. AFLP (Amplified Fragment
    Length Polymorphism) experiments can therefore be designed and supported with this program.

  • Fourth, group-specific probe sets can be designed.
    Microarray-based detection is straightforward if there exists
    a unique probe set that hybridizes only to its target sequences and not
    to non-target sequences. When the target sequences are highly similar,
    it becomes more likely that no unique probe set can be found for them.
    Excluding the sequences without a unique probe set keeps the experiment feasible but
    reduces the decoding range. In this case, non-unique probes are an alternative
    choice. All probe candidates for the oligo array are searched, and each candidate is
    checked for how it hybridizes to every target and non-target sequence.
    Next, to minimize the size of the probe set, an optimization step removes
    redundant probes. This method is based on integer linear programming.

  • Fifth, the oligonucleotide design tasks above and their supplementary work can be
    run through a graphical user interface on a platform-dependent framework. Most of the
    functions provided by the primer3 program are also available,
    since the newly developed software incorporates primer3 as its basis. In addition,
    multiple target sequences can be handled more easily for oligonucleotide design.
    Design outputs can also be displayed and validated in the graphical interface.

  • Finally, DNA sequences are too large to be searched by traditional algorithms. All target
    sequences for DNA-based detection must be scanned fully and repeatedly for
    match and mismatch information. In particular, when whole genome sequences are
    used for that purpose, large memory usage and high time complexity are unavoidable.
    The suffix tree data structure was chosen as the search engine since it has linear
    construction and search time. It is used for designing universal primers, checking
    the specificity of probe sets, and various matching processes, which makes it possible
    to provide feasible solutions for genome-level oligonucleotide design.
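
The alignment-free search in the second point, which reports common substrings together
with the number of sequences they match, can be illustrated with a small Python sketch.
It is not the Matchup implementation: it swaps the generalized suffix tree for a plain
dictionary of fixed-length substrings (k-mers), which is far less memory-efficient but
shows the same kind of output. The function name shared_kmers and its parameters are
hypothetical and chosen only for this example.

```python
from collections import defaultdict


def shared_kmers(sequences, k, min_sequences):
    """Return {k-mer: set of sequence indices} for k-mers found in at least
    `min_sequences` of the input sequences.

    Assumption: a dictionary of k-mers stands in for the generalized suffix
    tree described in the text; only exact, fixed-length matches are found.
    """
    owners = defaultdict(set)
    for index, sequence in enumerate(sequences):
        sequence = sequence.upper()
        # Slide a window of length k over the sequence and record its owner.
        for pos in range(len(sequence) - k + 1):
            owners[sequence[pos:pos + k]].add(index)
    # Keep only substrings shared by enough sequences.
    return {kmer: ids for kmer, ids in owners.items() if len(ids) >= min_sequences}


if __name__ == "__main__":
    targets = ["ACGTAC", "TTACGT", "GGGACG"]
    for kmer, ids in sorted(shared_kmers(targets, 3, 2).items()):
        print(kmer, sorted(ids))
```

Substrings shared by every sequence are candidates for universal primers, while those
shared by only a subset are candidates for semi-universal primers. Extending a hit to
the left or right while tracking degeneracy, as the text describes, is left out of this sketch.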

Depending on the similarity of the target sequences, different methods can be used
to design specific primers or probes, and the whole process can be run easily from the
graphical user interface. The Matchup program has been shown to give exact results.
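
To make the probe-set minimization in the fourth point more concrete, the sketch below
picks a small set of probes so that every target sequence is hit by at least one of them.
The text states that Matchup solves this with integer linear programming. The sketch
uses a greedy set-cover heuristic instead, which is simpler but does not guarantee a
minimum set. The function name greedy_probe_cover and the shape of the hits table are
assumptions made for this example only.

```python
def greedy_probe_cover(hits, targets):
    """Greedily choose probes until every coverable target is hit.

    hits    : dict mapping probe name -> set of target names it hybridizes to
              (assumed to come from a separate specificity check)
    targets : iterable of target names that should be detected
    Returns (chosen probes in order, targets that no probe can cover).

    Assumption: this is a greedy stand-in for the integer linear programming
    step named in the text, not the program's actual method.
    """
    uncovered = set(targets)
    chosen = []
    while uncovered and hits:
        # Pick the probe covering the most still-uncovered targets;
        # sorting the names makes tie-breaking deterministic.
        best = max(sorted(hits), key=lambda probe: len(hits[probe] & uncovered))
        gained = hits[best] & uncovered
        if not gained:
            break  # the remaining targets cannot be covered by any probe
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


if __name__ == "__main__":
    table = {"p1": {"A", "B"}, "p2": {"C"}, "p3": {"B", "C"}}
    print(greedy_probe_cover(table, ["A", "B", "C", "D"]))
```

Any target left in the second return value has no probe at all and would have to be
excluded or handled with additional candidates.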