[Template] Suffix automaton

Principle

Give each node of the automaton two attributes: a length range [min, max] and a right set. The node represents every substring whose length lies in [min, max] and whose set of end positions, taken over all of its occurrences, is exactly right.

It can be proved that the right sets of any two nodes are either disjoint or one contains the other.

So for each node, take as its father the node whose right set is the smallest one that strictly contains its own. This gives the parent tree.

Add one extra node, root. If a node's right set is not contained in that of any other node, make root its father.

The min of a node is always len[fa]+1, so len (the maximum length) and fa (the father in the parent tree) are enough to describe a node.
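The description so far can be checked by brute force on short strings. The sketch below is only an illustration, not how the automaton is built in practice. It groups substrings by right set, records each group's [min, max], picks the father as the smallest strictly containing right set, and tests min == len[fa]+1. The names Cls, classesOf and minIsFatherLenPlusOne are made up for this example.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical brute-force description of one node (not the post's data layout).
struct Cls {
    std::set<int> right;  // 1-indexed end positions shared by every substring in the class
    int mn, mx;           // shortest and longest substring length in the class
    int fa;               // index of the father class, or -1 when the father is root
};

std::vector<Cls> classesOf(const std::string& s) {
    const int n = static_cast<int>(s.size());

    // Right set of every distinct substring.
    std::map<std::string, std::set<int>> right;
    for (int i = 0; i < n; ++i)
        for (int j = i; j < n; ++j)
            right[s.substr(i, j - i + 1)].insert(j + 1);

    // Substrings with the same right set form one node; track its length range.
    std::map<std::set<int>, std::pair<int, int>> range;
    for (const auto& kv : right) {
        const int l = static_cast<int>(kv.first.size());
        auto it = range.find(kv.second);
        if (it == range.end()) {
            range[kv.second] = std::make_pair(l, l);
        } else {
            it->second.first = std::min(it->second.first, l);
            it->second.second = std::max(it->second.second, l);
        }
    }

    std::vector<Cls> cls;
    for (const auto& kv : range)
        cls.push_back(Cls{kv.first, kv.second.first, kv.second.second, -1});

    // Father = the class with the smallest right set strictly containing ours.
    for (size_t a = 0; a < cls.size(); ++a)
        for (size_t b = 0; b < cls.size(); ++b) {
            const std::set<int>& A = cls[a].right;
            const std::set<int>& B = cls[b].right;
            if (B.size() <= A.size()) continue;
            if (!std::includes(B.begin(), B.end(), A.begin(), A.end())) continue;
            if (cls[a].fa < 0 || B.size() < cls[cls[a].fa].right.size())
                cls[a].fa = static_cast<int>(b);
        }
    return cls;
}

// Checks min == len[fa] + 1 for every class, taking len[root] = 0.
bool minIsFatherLenPlusOne(const std::string& s) {
    const std::vector<Cls> cls = classesOf(s);
    for (const Cls& c : cls) {
        const int faLen = c.fa < 0 ? 0 : cls[c.fa].mx;
        if (c.mn != faLen + 1) return false;
    }
    return true;
}
```

For aabab this finds 6 classes (7 nodes once root is counted). For instance ab and b share the right set {3,5} with range [1,2], and aab sits below them with right set {3} and range [3,3].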

If appending the character x to a substring represented by node p gives a substring represented by node q, there is a transition p -x-> q.

The number of nodes and transitions is O(n); the proof is omitted here.

Practice

Consider online construction: append one character at a time. Call the new character x.

Keep track of lst, the node created by the previous insertion. Create a new node p to represent the whole current string.

First, len[p]=len[lst]+1 (lst represents the whole string before x was inserted).

Now look at lst and its ancestors in the parent tree, ordered by len from largest to smallest. Having an x transition is monotone along this chain: the first several nodes have none, and every node after them has one.

So every node in that first group, which has no x transition, gets a transition on x directly to p.

Next, let o be the first node on the chain that does have an x transition, and let q be the node it leads to.

Since len[p] is certainly larger than len[q], the natural move is to make q the father of p.

However, doing this directly can break len. Take aabab: just before the last b is inserted, the node representing aab has right set {3}. Adding 5 to that right set directly would shrink the node's len, because aab does not end at position 5 (only ab and b end at both 3 and 5), and aab would no longer be represented by any node.
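The aabab case can be looked at directly by computing right sets by brute force before and after the last b is appended. This is just a small sketch to make the example concrete. The helper name rightSet is an assumption of this illustration, and positions are 1-indexed as in the text.

```cpp
#include <set>
#include <string>

// Right set of t in s: the 1-indexed end position of every occurrence of t.
// Brute force, meant only for tiny examples.
std::set<int> rightSet(const std::string& s, const std::string& t) {
    std::set<int> r;
    for (size_t i = 0; i + t.size() <= s.size(); ++i)
        if (s.compare(i, t.size(), t) == 0)
            r.insert(static_cast<int>(i + t.size()));
    return r;
}
```

In aaba the substrings aab, ab and b all have right set {3}, so one node holds all three. In aabab the right set of ab and b grows to {3,5} while that of aab stays {3}, so the three can no longer share a node.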

There are two cases, depending on whether len[q]==len[o]+1.

If they are equal, the problem above does not occur; just set fa[p]=q.

If not, create a new node nq to force the equality.

The father and transitions of nq are copied from q, but its len is len[o]+1.

The original q is kept, of course. nq becomes the father of q, and nq can then be the father of p.

Finally, the ancestors of lst whose x transition leads to q (o is the first of them) must now lead to nq instead. In the ordering above, these nodes are again consecutive.

In addition, if no node on the chain has an x transition, just set fa[p]=root.

The total complexity is amortized O(n), treating the alphabet size as a constant.

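// Appends character x after node o (the previous lst) and returns the new lst.
// Node 0 is the null node, so fa[rt] must be 0 and tr[][] entries of 0 mean "no transition".
inline int insert(int x, int o){
    int p=++pct;
    len[p]=len[o]+1;
    for(;o&&!tr[o][x];o=fa[o]) tr[o][x]=p;        // nodes without an x transition go to p
    if(!o){fa[p]=rt;return p;}                    // no ancestor has an x transition
    int q=tr[o][x];
    if(len[q]==len[o]+1){fa[p]=q;return p;}       // q can be the father directly
    int nq=++pct;                                 // otherwise split q
    fa[nq]=fa[q],memcpy(tr[nq],tr[q],sizeof(tr[q]));
    fa[q]=fa[p]=nq;len[nq]=len[o]+1;
    for(;o&&tr[o][x]==q;o=fa[o]) tr[o][x]=nq;     // redirect transitions from q to nq
    return p;
}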
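The routine above needs its arrays declared and a driver around it before it can run. Here is one possible self-contained sketch. The Sam wrapper, the 26-letter lowercase alphabet, and the helpers build, contains and distinctSubstrings are assumptions of this example and not part of the original template; insert itself follows the routine above, with memcpy replaced by a std::array copy.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

struct Sam {                              // hypothetical wrapper around the post's arrays
    static constexpr int SIGMA = 26;      // assumed alphabet: 'a'..'z'
    std::vector<std::array<int, SIGMA>> tr;
    std::vector<int> fa, len;
    int rt, pct;                          // node 0 is the null node, root is node 1

    // A string of length n needs at most 2n + 1 node ids including root.
    explicit Sam(int n) : tr(2 * n + 2), fa(2 * n + 2, 0), len(2 * n + 2, 0), rt(1), pct(1) {}

    // Same steps as the insert routine in the post; o is the previous lst.
    int insert(int x, int o) {
        assert(0 <= x && x < SIGMA);
        int p = ++pct;
        len[p] = len[o] + 1;
        for (; o && !tr[o][x]; o = fa[o]) tr[o][x] = p;
        if (!o) { fa[p] = rt; return p; }
        int q = tr[o][x];
        if (len[q] == len[o] + 1) { fa[p] = q; return p; }
        int nq = ++pct;                   // split q
        fa[nq] = fa[q];
        tr[nq] = tr[q];                   // copies all transitions of q
        fa[q] = fa[p] = nq;
        len[nq] = len[o] + 1;
        for (; o && tr[o][x] == q; o = fa[o]) tr[o][x] = nq;
        return p;
    }
};

// Builds the automaton of s (s must consist of 'a'..'z' only).
Sam build(const std::string& s) {
    Sam a(static_cast<int>(s.size()));
    int lst = a.rt;
    for (char c : s) lst = a.insert(c - 'a', lst);
    return a;
}

// True if t (lowercase letters only) occurs in the string the automaton was built from.
bool contains(const Sam& a, const std::string& t) {
    int u = a.rt;
    for (char c : t) {
        u = a.tr[u][c - 'a'];
        if (!u) return false;
    }
    return true;
}

// Each non-root node v represents len[v] - len[fa[v]] distinct substrings.
long long distinctSubstrings(const Sam& a) {
    long long total = 0;
    for (int v = 2; v <= a.pct; ++v) total += a.len[v] - a.len[a.fa[v]];
    return total;
}
```

With this sketch, build("aabab") ends up with 7 nodes including root (the last b triggers the split, giving an nq with len 2), contains finds aba but not bb, and distinctSubstrings returns 11.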

Example

To be added later.

posted @ 2019-04-26 13:48   Ressed