LeetCode Repeated DNA Sequences

187. Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

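Method 1

Understanding the problem statement matters here. Put simply, among all the 10-letter substrings of s, we need to find the ones that occur more than once. Problems like this are usually solved with a hash table such as a HashMap; the solution below only needs to know whether a substring has been seen before, so two HashSets are enough.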

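import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        // Every 10-letter substring seen so far.
        HashSet<String> count = new HashSet<String>();
        // Substrings seen at least twice; a set keeps each one only once.
        HashSet<String> repeated = new HashSet<String>();
        // i + 9 is the last index of the window that starts at i.
        for (int i = 0; i + 9 < s.length(); i++) {
            String str = s.substring(i, i + 10);
            if (!count.contains(str)) {
                count.add(str);
            } else {
                repeated.add(str);
            }
        }
        return new ArrayList<String>(repeated);
    }
}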

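Method 2

Another approach combines a HashMap with bit manipulation.

Since there are only 4 distinct characters, each one can be encoded in 2 bits: 00, 01, 10, 11. Each sequence has 10 characters, so 20 bits are enough to represent it. For every new character, shift the current value left by 2 bits, add the new character's 2-bit code, and mask the result with & 0xFFFFF to keep only the lowest 20 bits. The resulting integer is then stored in and looked up from a HashMap.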
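
The bit-encoding idea above can be sketched roughly as follows. This is only one possible reading of the description, not the author's own code: the class name RepeatedDnaBits, the helper code(), and the particular mapping A=00, C=01, G=10, T=11 are assumptions made for illustration, and the input is assumed to contain only the letters A, C, G, and T.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RepeatedDnaBits {
    // Hypothetical helper: maps a nucleotide to an assumed 2-bit code.
    private static int code(char c) {
        switch (c) {
            case 'A': return 0; // 00
            case 'C': return 1; // 01
            case 'G': return 2; // 10
            default:  return 3; // 11, assumes the character is 'T'
        }
    }

    static List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<>();
        // Encoded 10-letter window -> number of times it has been seen.
        Map<Integer, Integer> seen = new HashMap<>();
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new character and keep only the lowest 20 bits,
            // i.e. the last 10 characters.
            hash = ((hash << 2) | code(s.charAt(i))) & 0xFFFFF;
            if (i >= 9) { // a full 10-letter window ends at index i
                int times = seen.merge(hash, 1, Integer::sum);
                if (times == 2) { // report each repeat exactly once
                    result.add(s.substring(i - 9, i + 1));
                }
            }
        }
        return result;
    }
}
```

Because 20 bits identify a 10-letter window uniquely, two windows share a key only when they are the same string, so no extra string comparison is needed. Compared with Method 1, the map stores integers instead of 10-character strings, and a substring is only built when a repeat is actually reported.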
