Geminstaller C0 Coverage Information - RCov

/Users/woolley/.rvm/gems/ruby-1.8.7-p174@geminstaller/gems/diff-lcs-1.1.2/lib/diff/lcs.rb

Name Total Lines Lines of Code Total Coverage Code Coverage
/Users/woolley/.rvm/gems/ruby-1.8.7-p174@geminstaller/gems/diff-lcs-1.1.2/lib/diff/lcs.rb 1105 553
46.33%
6.51%

Key

Code reported as executed by Ruby looks like this...and this: this line is also marked as covered.Lines considered as run by rcov, but not reported by Ruby, look like this,and this: these lines were inferred by rcov (using simple heuristics).Finally, here's a line marked as not executed.

Coverage Details

1 #! /usr/env/bin ruby
2 #--
3 # Copyright 2004 Austin Ziegler <diff-lcs@halostatue.ca>
4 #   adapted from:
5 #     Algorithm::Diff (Perl) by Ned Konz <perl@bike-nomad.com>
6 #     Smalltalk by Mario I. Wolczko <mario@wolczko.com>
7 #   implements McIlroy-Hunt diff algorithm
8 #
9 # This program is free software. It may be redistributed and/or modified
10 # under the terms of the GPL version 2 (or later), the Perl Artistic
11 # licence, or the Ruby licence.
12 # 
13 # $Id: lcs.rb,v 1.9 2004/10/17 20:31:10 austin Exp $
14 #++
15 
16 module Diff
17     # = Diff::LCS 1.1.2
18     # Computes "intelligent" differences between two sequenced Enumerables.
19     # This is an implementation of the McIlroy-Hunt "diff" algorithm for
20     # Enumerable objects that include Diffable.
21     #
22     # Based on Mario I. Wolczko's <mario@wolczko.com> Smalltalk version
23     # (1.2, 1993) and Ned Konz's <perl@bike-nomad.com> Perl version
24     # (Algorithm::Diff).
25     #
26     # == Synopsis
27     #   require 'diff/lcs'
28     #
29     #   seq1 = %w(a b c e h j l m n p)
30     #   seq2 = %w(b c d e f j k l m r s t)
31     #
32     #   lcs = Diff::LCS.LCS(seq1, seq2)
33     #   diffs = Diff::LCS.diff(seq1, seq2)
34     #   sdiff = Diff::LCS.sdiff(seq1, seq2)
35     #   seq = Diff::LCS.traverse_sequences(seq1, seq2, callback_obj)
36     #   bal = Diff::LCS.traverse_balanced(seq1, seq2, callback_obj)
37     #   seq2 == Diff::LCS.patch(seq1, diffs)
38     #   seq2 == Diff::LCS.patch!(seq1, diffs)
39     #   seq1 == Diff::LCS.unpatch(seq2, diffs)
40     #   seq1 == Diff::LCS.unpatch!(seq2, diffs)
41     #   seq2 == Diff::LCS.patch(seq1, sdiff)
42     #   seq2 == Diff::LCS.patch!(seq1, sdiff)
43     #   seq1 == Diff::LCS.unpatch(seq2, sdiff)
44     #   seq1 == Diff::LCS.unpatch!(seq2, sdiff)
45     #
46     # Alternatively, objects can be extended with Diff::LCS:
47     #
48     #   seq1.extend(Diff::LCS)
49     #   lcs = seq1.lcs(seq2)
50     #   diffs = seq1.diff(seq2)
51     #   sdiff = seq1.sdiff(seq2)
52     #   seq = seq1.traverse_sequences(seq2, callback_obj)
53     #   bal = seq1.traverse_balanced(seq2, callback_obj)
54     #   seq2 == seq1.patch(diffs)
55     #   seq2 == seq1.patch!(diffs)
56     #   seq1 == seq2.unpatch(diffs)
57     #   seq1 == seq2.unpatch!(diffs)
58     #   seq2 == seq1.patch(sdiff)
59     #   seq2 == seq1.patch!(sdiff)
60     #   seq1 == seq2.unpatch(sdiff)
61     #   seq1 == seq2.unpatch!(sdiff)
62     # 
63     # Default extensions are provided for Array and String objects through
64     # the use of 'diff/lcs/array' and 'diff/lcs/string'.
65     #
66     # == Introduction (by Mark-Jason Dominus)
67     # 
68     # <em>The following text is from the Perl documentation. The only
69     # changes have been to make the text appear better in Rdoc</em>.
70     #
71     # I once read an article written by the authors of +diff+; they said
72     # that they hard worked very hard on the algorithm until they found the
73     # right one.
74     #
75     # I think what they ended up using (and I hope someone will correct me,
76     # because I am not very confident about this) was the `longest common
77     # subsequence' method. In the LCS problem, you have two sequences of
78     # items:
79     #
80     #    a b c d f g h j q z
81     #    a b c d e f g i j k r x y z
82     #
83     # and you want to find the longest sequence of items that is present in
84     # both original sequences in the same order. That is, you want to find a
85     # new sequence *S* which can be obtained from the first sequence by
86     # deleting some items, and from the second sequence by deleting other
87     # items. You also want *S* to be as long as possible. In this case *S*
88     # is:
89     # 
90     #    a b c d f g j z
91     #
92     # From there it's only a small step to get diff-like output:
93     #
94     #    e   h i   k   q r x y
95     #    +   - +   +   - + + +
96     #
97     # This module solves the LCS problem. It also includes a canned function
98     # to generate +diff+-like output.
99     #
100     # It might seem from the example above that the LCS of two sequences is
101     # always pretty obvious, but that's not always the case, especially when
102     # the two sequences have many repeated elements. For example, consider
103     #
104     #    a x b y c z p d q
105     #    a b c a x b y c z
106     #
107     # A naive approach might start by matching up the +a+ and +b+ that
108     # appear at the beginning of each sequence, like this:
109     # 
110     #    a x b y c         z p d q
111     #    a   b   c a b y c z
112     #
113     # This finds the common subsequence +a b c z+. But actually, the LCS is
114     # +a x b y c z+:
115     #
116     #          a x b y c z p d q
117     #    a b c a x b y c z
118     #
119     # == Author
120     # This version is by Austin Ziegler <diff-lcs@halostatue.ca>.
121     #
122     # It is based on the Perl Algorithm::Diff by Ned Konz
123     # <perl@bike-nomad.com>, copyright &copy; 2000 - 2002 and the Smalltalk
124     # diff version by Mario I. Wolczko <mario@wolczko.com>, copyright &copy;
125     # 1993. Documentation includes work by Mark-Jason Dominus.
126     #
127     # == Licence
128     # Copyright &copy; 2004 Austin Ziegler
129     # This program is free software; you can redistribute it and/or modify it
130     # under the same terms as Ruby, or alternatively under the Perl Artistic
131     # licence.
132     #
133     # == Credits
134     # Much of the documentation is taken directly from the Perl
135     # Algorithm::Diff implementation and was written originally by Mark-Jason
136     # Dominus <mjd-perl-diff@plover.com> and later by Ned Konz. The basic Ruby
137     # implementation was re-ported from the Smalltalk implementation, available
138     # at ftp://st.cs.uiuc.edu/pub/Smalltalk/MANCHESTER/manchester/4.0/diff.st
139     #
140     # #sdiff and #traverse_balanced were written for the Perl version by Mike
141     # Schilli <m@perlmeister.com>.
142     #
143     # "The algorithm is described in <em>A Fast Algorithm for Computing Longest
144     # Common Subsequences</em>, CACM, vol.20, no.5, pp.350-353, May 1977, with
145     # a few minor improvements to improve the speed."
146   module LCS
147     VERSION = '1.1.2'
148   end
149 end
150 
151 require 'diff/lcs/callbacks'
152 
153 module Diff::LCS
154     # Returns an Array containing the longest common subsequence(s) between
155     # +self+ and +other+. See Diff::LCS#LCS.
156     #
157     #   lcs = seq1.lcs(seq2)
158   def lcs(other, &block) #:yields self[ii] if there are matched subsequences:
159     Diff::LCS.LCS(self, other, &block)
160   end
161 
162     # Returns the difference set between +self+ and +other+. See
163     # Diff::LCS#diff.
164   def diff(other, callbacks = nil, &block)
165     Diff::LCS::diff(self, other, callbacks, &block)
166   end
167 
168     # Returns the balanced ("side-by-side") difference set between +self+ and
169     # +other+. See Diff::LCS#sdiff.
170   def sdiff(other, callbacks = nil, &block)
171     Diff::LCS::sdiff(self, other, callbacks, &block)
172   end
173 
174     # Traverses the discovered longest common subsequences between +self+ and
175     # +other+. See Diff::LCS#traverse_sequences.
176   def traverse_sequences(other, callbacks = nil, &block)
177     traverse_sequences(self, other, callbacks || Diff::LCS::YieldingCallbacks,
178                        &block)
179   end
180 
181     # Traverses the discovered longest common subsequences between +self+ and
182     # +other+ using the alternate, balanced algorithm. See
183     # Diff::LCS#traverse_balanced.
184   def traverse_balanced(other, callbacks = nil, &block)
185     traverse_balanced(self, other, callbacks || Diff::LCS::YieldingCallbacks,
186                       &block)
187   end
188 
189     # Attempts to patch a copy of +self+ with the provided +patchset+. See
190     # Diff::LCS#patch.
191   def patch(patchset)
192     Diff::LCS::patch(self.dup, patchset)
193   end
194 
195     # Attempts to unpatch a copy of +self+ with the provided +patchset+.
196     # See Diff::LCS#patch.
197   def unpatch(patchset)
198     Diff::LCS::unpatch(self.dup, patchset)
199   end
200 
201     # Attempts to patch +self+ with the provided +patchset+. See
202     # Diff::LCS#patch!. Does no autodiscovery.
203   def patch!(patchset)
204     Diff::LCS::patch!(self, patchset)
205   end
206 
207     # Attempts to unpatch +self+ with the provided +patchset+. See
208     # Diff::LCS#unpatch. Does no autodiscovery.
209   def unpatch!(patchset)
210     Diff::LCS::unpatch!(self, patchset)
211   end
212 end
213 
214 module Diff::LCS
215   class << self
216       # Given two sequenced Enumerables, LCS returns an Array containing their
217       # longest common subsequences.
218       #
219       #   lcs = Diff::LCS.LCS(seq1, seq2)
220       #
221       # This array whose contents is such that:
222       #
223       #   lcs.each_with_index do |ee, ii|
224       #     assert(ee.nil? || (seq1[ii] == seq2[ee]))
225       #   end
226       #
227       # If a block is provided, the matching subsequences will be yielded from
228       # +seq1+ in turn and may be modified before they are placed into the
229       # returned Array of subsequences.
230     def LCS(seq1, seq2, &block) #:yields seq1[ii] for each matched:
231       matches = Diff::LCS.__lcs(seq1, seq2)
232       ret = []
233       matches.each_with_index do |ee, ii|
234         unless matches[ii].nil?
235           if block_given?
236             ret << (yield seq1[ii])
237           else
238             ret << seq1[ii]
239           end
240         end
241       end
242       ret
243     end
244 
245       # Diff::LCS.diff computes the smallest set of additions and deletions
246       # necessary to turn the first sequence into the second, and returns a
247       # description of these changes.
248       # 
249       # See Diff::LCS::DiffCallbacks for the default behaviour. An alternate
250       # behaviour may be implemented with Diff::LCS::ContextDiffCallbacks.
251       # If a Class argument is provided for +callbacks+, #diff will attempt
252       # to initialise it. If the +callbacks+ object (possibly initialised)
253       # responds to #finish, it will be called.
254     def diff(seq1, seq2, callbacks = nil, &block) # :yields diff changes:
255       callbacks ||= Diff::LCS::DiffCallbacks
256       if callbacks.kind_of?(Class)
257         cb = callbacks.new rescue callbacks
258         callbacks = cb
259       end
260       traverse_sequences(seq1, seq2, callbacks)
261       callbacks.finish if callbacks.respond_to?(:finish)
262 
263       if block_given?
264         res = callbacks.diffs.map do |hunk|
265           if hunk.kind_of?(Array)
266             hunk = hunk.map { |block| yield block }
267           else
268             yield hunk
269           end
270         end
271         res
272       else
273         callbacks.diffs
274       end
275     end
276 
277       # Diff::LCS.sdiff computes all necessary components to show two sequences
278       # and their minimized differences side by side, just like the Unix
279       # utility <em>sdiff</em> does:
280       #
281       #     old        <     -
282       #     same             same
283       #     before     |     after
284       #     -          >     new
285       #
286       # See Diff::LCS::SDiffCallbacks for the default behaviour. An alternate
287       # behaviour may be implemented with Diff::LCS::ContextDiffCallbacks. If
288       # a Class argument is provided for +callbacks+, #diff will attempt to
289       # initialise it. If the +callbacks+ object (possibly initialised)
290       # responds to #finish, it will be called.
291     def sdiff(seq1, seq2, callbacks = nil, &block) #:yields diff changes:
292       callbacks ||= Diff::LCS::SDiffCallbacks
293       if callbacks.kind_of?(Class)
294         cb = callbacks.new rescue callbacks
295         callbacks = cb
296       end
297       traverse_balanced(seq1, seq2, callbacks)
298       callbacks.finish if callbacks.respond_to?(:finish)
299 
300       if block_given?
301         res = callbacks.diffs.map do |hunk|
302           if hunk.kind_of?(Array)
303             hunk = hunk.map { |block| yield block }
304           else
305             yield hunk
306           end
307         end
308         res
309       else
310         callbacks.diffs
311       end
312     end
313 
314       # Diff::LCS.traverse_sequences is the most general facility provided by this
315       # module; +diff+ and +LCS+ are implemented as calls to it.
316       #
317       # The arguments to #traverse_sequences are the two sequences to
318       # traverse, and a callback object, like this:
319       #
320       #   traverse_sequences(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
321       #
322       # #diff is implemented with #traverse_sequences.
323       #
324       # == Callback Methods
325       # Optional callback methods are <em>emphasized</em>.
326       #
327       # callbacks#match::               Called when +a+ and +b+ are pointing
328       #                                 to common elements in +A+ and +B+.
329       # callbacks#discard_a::           Called when +a+ is pointing to an
330       #                                 element not in +B+.
331       # callbacks#discard_b::           Called when +b+ is pointing to an
332       #                                 element not in +A+.
333       # <em>callbacks#finished_a</em>:: Called when +a+ has reached the end of
334       #                                 sequence +A+.
335       # <em>callbacks#finished_b</em>:: Called when +b+ has reached the end of
336       #                                 sequence +B+.
337       #
338       # == Algorithm
339       #       a---+
340       #           v
341       #       A = a b c e h j l m n p
342       #       B = b c d e f j k l m r s t
343       #           ^
344       #       b---+
345       #
346       # If there are two arrows (+a+ and +b+) pointing to elements of
347       # sequences +A+ and +B+, the arrows will initially point to the first
348       # elements of their respective sequences. #traverse_sequences will
349       # advance the arrows through the sequences one element at a time,
350       # calling a method on the user-specified callback object before each
351       # advance. It will advance the arrows in such a way that if there are
352       # elements <tt>A[ii]</tt> and <tt>B[jj]</tt> which are both equal and
353       # part of the longest common subsequence, there will be some moment
354       # during the execution of #traverse_sequences when arrow +a+ is pointing
355       # to <tt>A[ii]</tt> and arrow +b+ is pointing to <tt>B[jj]</tt>. When
356       # this happens, #traverse_sequences will call <tt>callbacks#match</tt>
357       # and then it will advance both arrows.
358       #
359       # Otherwise, one of the arrows is pointing to an element of its sequence
360       # that is not part of the longest common subsequence.
361       # #traverse_sequences will advance that arrow and will call
362       # <tt>callbacks#discard_a</tt> or <tt>callbacks#discard_b</tt>, depending
363       # on which arrow it advanced. If both arrows point to elements that are
364       # not part of the longest common subsequence, then #traverse_sequences
365       # will advance one of them and call the appropriate callback, but it is
366       # not specified which it will call.
367       #
368       # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
369       # and <tt>callbacks#discard_b</tt> are invoked with an event comprising
370       # the action ("=", "+", or "-", respectively), the indicies +ii+ and
371       # +jj+, and the elements <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return
372       # values are discarded by #traverse_sequences.
373       #
374       # === End of Sequences
375       # If arrow +a+ reaches the end of its sequence before arrow +b+ does,
376       # #traverse_sequence try to call <tt>callbacks#finished_a</tt> with the
377       # last index and element of +A+ (<tt>A[-1]</tt>) and the current index
378       # and element of +B+ (<tt>B[jj]</tt>). If <tt>callbacks#finished_a</tt>
379       # does not exist, then <tt>callbacks#discard_b</tt> will be called on
380       # each element of +B+ until the end of the sequence is reached (the call
381       # will be done with <tt>A[-1]</tt> and <tt>B[jj]</tt> for each element).
382       #
383       # If +b+ reaches the end of +B+ before +a+ reaches the end of +A+,
384       # <tt>callbacks#finished_b</tt> will be called with the current index
385       # and element of +A+ (<tt>A[ii]</tt>) and the last index and element of
386       # +B+ (<tt>A[-1]</tt>). Again, if <tt>callbacks#finished_b</tt> does not
387       # exist on the callback object, then <tt>callbacks#discard_a</tt> will
388       # be called on each element of +A+ until the end of the sequence is
389       # reached (<tt>A[ii]</tt> and <tt>B[-1]</tt>).
390       #
391       # There is a chance that one additional <tt>callbacks#discard_a</tt> or
392       # <tt>callbacks#discard_b</tt> will be called after the end of the
393       # sequence is reached, if +a+ has not yet reached the end of +A+ or +b+
394       # has not yet reached the end of +B+.
395     def traverse_sequences(seq1, seq2, callbacks = Diff::LCS::SequenceCallbacks, &block) #:yields change events:
396       matches = Diff::LCS.__lcs(seq1, seq2)
397 
398       run_finished_a = run_finished_b = false
399       string = seq1.kind_of?(String)
400 
401       a_size = seq1.size
402       b_size = seq2.size
403       ai = bj = 0
404 
405       (0 .. matches.size).each do |ii|
406         b_line = matches[ii]
407 
408         ax = string ? seq1[ii, 1] : seq1[ii]
409         bx = string ? seq2[bj, 1] : seq2[bj]
410 
411         if b_line.nil?
412           unless ax.nil?
413             event = Diff::LCS::ContextChange.new('-', ii, ax, bj, bx)
414             event = yield event if block_given?
415             callbacks.discard_a(event)
416           end
417         else
418           loop do
419             break unless bj < b_line
420             bx = string ? seq2[bj, 1] : seq2[bj]
421             event = Diff::LCS::ContextChange.new('+', ii, ax, bj, bx)
422             event = yield event if block_given?
423             callbacks.discard_b(event)
424             bj += 1
425           end
426           bx = string ? seq2[bj, 1] : seq2[bj]
427           event = Diff::LCS::ContextChange.new('=', ii, ax, bj, bx)
428           event = yield event if block_given?
429           callbacks.match(event)
430           bj += 1
431         end
432         ai = ii
433       end
434       ai += 1
435 
436         # The last entry (if any) processed was a match. +ai+ and +bj+ point
437         # just past the last matching lines in their sequences.
438       while (ai < a_size) or (bj < b_size)
439           # last A?
440         if ai == a_size and bj < b_size
441           if callbacks.respond_to?(:finished_a) and not run_finished_a
442             ax = string ? seq1[-1, 1] : seq1[-1]
443             bx = string ? seq2[bj, 1] : seq2[bj]
444             event = Diff::LCS::ContextChange.new('>', (a_size - 1), ax, bj, bx)
445             event = yield event if block_given?
446             callbacks.finished_a(event)
447             run_finished_a = true
448           else
449             ax = string ? seq1[ai, 1] : seq1[ai]
450             loop do
451               bx = string ? seq2[bj, 1] : seq2[bj]
452               event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
453               event = yield event if block_given?
454               callbacks.discard_b(event)
455               bj += 1
456               break unless bj < b_size
457             end
458           end
459         end
460 
461           # last B?
462         if bj == b_size and ai < a_size
463           if callbacks.respond_to?(:finished_b) and not run_finished_b
464             ax = string ? seq1[ai, 1] : seq1[ai]
465             bx = string ? seq2[-1, 1] : seq2[-1]
466             event = Diff::LCS::ContextChange.new('<', ai, ax, (b_size - 1), bx)
467             event = yield event if block_given?
468             callbacks.finished_b(event)
469             run_finished_b = true
470           else
471             bx = string ? seq2[bj, 1] : seq2[bj]
472             loop do
473               ax = string ? seq1[ai, 1] : seq1[ai]
474               event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
475               event = yield event if block_given?
476               callbacks.discard_a(event)
477               ai += 1
478               break unless bj < b_size
479             end
480           end
481         end
482 
483         if ai < a_size
484           ax = string ? seq1[ai, 1] : seq1[ai]
485           bx = string ? seq2[bj, 1] : seq2[bj]
486           event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
487           event = yield event if block_given?
488           callbacks.discard_a(event)
489           ai += 1
490         end
491 
492         if bj < b_size
493           ax = string ? seq1[ai, 1] : seq1[ai]
494           bx = string ? seq2[bj, 1] : seq2[bj]
495           event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
496           event = yield event if block_given?
497           callbacks.discard_b(event)
498           bj += 1
499         end
500       end
501     end
502 
503       # #traverse_balanced is an alternative to #traverse_sequences. It
504       # uses a different algorithm to iterate through the entries in the
505       # computed longest common subsequence. Instead of viewing the changes as
506       # insertions or deletions from one of the sequences, #traverse_balanced
507       # will report <em>changes</em> between the sequences. To represent a
508       #
509       # The arguments to #traverse_balanced are the two sequences to traverse
510       # and a callback object, like this:
511       #
512       #   traverse_balanced(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
513       #
514       # #sdiff is implemented with #traverse_balanced.
515       #
516       # == Callback Methods
517       # Optional callback methods are <em>emphasized</em>.
518       #
519       # callbacks#match::               Called when +a+ and +b+ are pointing
520       #                                 to common elements in +A+ and +B+.
521       # callbacks#discard_a::           Called when +a+ is pointing to an
522       #                                 element not in +B+.
523       # callbacks#discard_b::           Called when +b+ is pointing to an
524       #                                 element not in +A+.
525       # <em>callbacks#change</em>::     Called when +a+ and +b+ are pointing
526       #                                 to the same relative position, but
527       #                                 <tt>A[a]</tt> and <tt>B[b]</tt> are
528       #                                 not the same; a <em>change</em> has
529       #                                 occurred.
530       #
531       # #traverse_balanced might be a bit slower than #traverse_sequences,
532       # noticable only while processing huge amounts of data.
533       #
534       # The +sdiff+ function of this module is implemented as call to
535       # #traverse_balanced.
536       #
537       # == Algorithm
538       #       a---+
539       #           v
540       #       A = a b c e h j l m n p
541       #       B = b c d e f j k l m r s t
542       #           ^
543       #       b---+
544       #
545       # === Matches
546       # If there are two arrows (+a+ and +b+) pointing to elements of
547       # sequences +A+ and +B+, the arrows will initially point to the first
548       # elements of their respective sequences. #traverse_sequences will
549       # advance the arrows through the sequences one element at a time,
550       # calling a method on the user-specified callback object before each
551       # advance. It will advance the arrows in such a way that if there are
552       # elements <tt>A[ii]</tt> and <tt>B[jj]</tt> which are both equal and
553       # part of the longest common subsequence, there will be some moment
554       # during the execution of #traverse_sequences when arrow +a+ is pointing
555       # to <tt>A[ii]</tt> and arrow +b+ is pointing to <tt>B[jj]</tt>. When
556       # this happens, #traverse_sequences will call <tt>callbacks#match</tt>
557       # and then it will advance both arrows.
558       #
559       # === Discards
560       # Otherwise, one of the arrows is pointing to an element of its sequence
561       # that is not part of the longest common subsequence.
562       # #traverse_sequences will advance that arrow and will call
563       # <tt>callbacks#discard_a</tt> or <tt>callbacks#discard_b</tt>,
564       # depending on which arrow it advanced.
565       #
566       # === Changes
567       # If both +a+ and +b+ point to elements that are not part of the longest
568       # common subsequence, then #traverse_sequences will try to call
569       # <tt>callbacks#change</tt> and advance both arrows. If
570       # <tt>callbacks#change</tt> is not implemented, then
571       # <tt>callbacks#discard_a</tt> and <tt>callbacks#discard_b</tt> will be
572       # called in turn.
573       #
574       # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
575       # <tt>callbacks#discard_b</tt>, and <tt>callbacks#change</tt> are
576       # invoked with an event comprising the action ("=", "+", "-", or "!",
577       # respectively), the indicies +ii+ and +jj+, and the elements
578       # <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return values are discarded by
579       # #traverse_balanced.
580       #
581       # === Context
582       # Note that +ii+ and +jj+ may not be the same index position, even if
583       # +a+ and +b+ are considered to be pointing to matching or changed
584       # elements.
585     def traverse_balanced(seq1, seq2, callbacks = Diff::LCS::BalancedCallbacks)
586       matches = Diff::LCS.__lcs(seq1, seq2)
587       a_size = seq1.size
588       b_size = seq2.size
589       ai = bj = mb = 0
590       ma = -1
591       string = seq1.kind_of?(String)
592 
593         # Process all the lines in the match vector.
594       loop do
595           # Find next match indices +ma+ and +mb+
596         loop do
597           ma += 1
598           break unless ma < matches.size and matches[ma].nil?
599         end
600 
601         break if ma >= matches.size # end of matches?
602         mb = matches[ma]
603 
604           # Change(seq2)
605         while (ai < ma) or (bj < mb)
606           ax = string ? seq1[ai, 1] : seq1[ai]
607           bx = string ? seq2[bj, 1] : seq2[bj]
608 
609           case [(ai < ma), (bj < mb)]
610           when [true, true]
611             if callbacks.respond_to?(:change)
612               event = Diff::LCS::ContextChange.new('!', ai, ax, bj, bx)
613               event = yield event if block_given?
614               callbacks.change(event)
615               ai += 1
616               bj += 1
617             else
618               event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
619               event = yield event if block_given?
620               callbacks.discard_a(event)
621               ai += 1
622               ax = string ? seq1[ai, 1] : seq1[ai]
623               event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
624               event = yield event if block_given?
625               callbacks.discard_b(event)
626               bj += 1
627             end
628           when [true, false]
629             event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
630             event = yield event if block_given?
631             callbacks.discard_a(event)
632             ai += 1
633           when [false, true]
634             event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
635             event = yield event if block_given?
636             callbacks.discard_b(event)
637             bj += 1
638           end
639         end
640 
641           # Match
642         ax = string ? seq1[ai, 1] : seq1[ai]
643         bx = string ? seq2[bj, 1] : seq2[bj]
644         event = Diff::LCS::ContextChange.new('=', ai, ax, bj, bx)
645         event = yield event if block_given?
646         callbacks.match(event)
647         ai += 1
648         bj += 1
649       end
650 
651       while (ai < a_size) or (bj < b_size)
652         ax = string ? seq1[ai, 1] : seq1[ai]
653         bx = string ? seq2[bj, 1] : seq2[bj]
654 
655         case [(ai < a_size), (bj < b_size)]
656         when [true, true]
657           if callbacks.respond_to?(:change)
658             event = Diff::LCS::ContextChange.new('!', ai, ax, bj, bx)
659             event = yield event if block_given?
660             callbacks.change(event)
661             ai += 1
662             bj += 1
663           else
664             event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
665             event = yield event if block_given?
666             callbacks.discard_a(event)
667             ai += 1
668             ax = string ? seq1[ai, 1] : seq1[ai]
669             event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
670             event = yield event if block_given?
671             callbacks.discard_b(event)
672             bj += 1
673           end
674         when [true, false]
675           event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
676           event = yield event if block_given?
677           callbacks.discard_a(event)
678           ai += 1
679         when [false, true]
680           event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
681           event = yield event if block_given?
682           callbacks.discard_b(event)
683           bj += 1
684         end
685       end
686     end
687 
688     PATCH_MAP = { #:nodoc:
689       :patch => { '+' => '+', '-' => '-', '!' => '!', '=' => '=' },
690       :unpatch => { '+' => '-', '-' => '+', '!' => '!', '=' => '=' }
691     }
692 
693       # Given a patchset, convert the current version to the new
694       # version. If +direction+ is not specified (must be
695       # <tt>:patch</tt> or <tt>:unpatch</tt>), then discovery of the
696       # direction of the patch will be attempted.
697     def patch(src, patchset, direction = nil)
698       string = src.kind_of?(String)
699         # Start with a new empty type of the source's class
700       res = src.class.new
701 
702         # Normalize the patchset.
703       patchset = __normalize_patchset(patchset)
704 
705       direction ||= Diff::LCS.__diff_direction(src, patchset)
706       direction ||= :patch
707 
708       ai = bj = 0
709 
710       patchset.each do |change|
711           # Both Change and ContextChange support #action
712         action = PATCH_MAP[direction][change.action]
713 
714         case change
715         when Diff::LCS::ContextChange
716           case direction
717           when :patch
718             el = change.new_element
719             op = change.old_position
720             np = change.new_position
721           when :unpatch
722             el = change.old_element
723             op = change.new_position
724             np = change.old_position
725           end
726 
727           case action
728           when '-' # Remove details from the old string
729             while ai < op
730               res << (string ? src[ai, 1] : src[ai])
731               ai += 1
732               bj += 1
733             end
734             ai += 1
735           when '+'
736             while bj < np
737               res << (string ? src[ai, 1] : src[ai])
738               ai += 1
739               bj += 1
740             end
741 
742             res << el
743             bj += 1
744           when '='
745               # This only appears in sdiff output with the SDiff callback.
746               # Therefore, we only need to worry about dealing with a single
747               # element.
748             res << el
749 
750             ai += 1
751             bj += 1
752           when '!'
753             while ai < op
754               res << (string ? src[ai, 1] : src[ai])
755               ai += 1
756               bj += 1
757             end
758 
759             bj += 1
760             ai += 1
761 
762             res << el
763           end
764         when Diff::LCS::Change
765           case action
766           when '-'
767             while ai < change.position
768               res << (string ? src[ai, 1] : src[ai])
769               ai += 1
770               bj += 1
771             end
772             ai += 1
773           when '+'
774             while bj < change.position
775               res << (string ? src[ai, 1] : src[ai])
776               ai += 1
777               bj += 1
778             end
779 
780             bj += 1
781 
782             res << change.element
783           end
784         end
785       end
786 
787       while ai < src.size
788         res << (string ? src[ai, 1] : src[ai])
789         ai += 1
790         bj += 1
791       end
792 
793       res
794     end
795 
796       # Given a set of patchset, convert the current version to the prior
797       # version. Does no auto-discovery.
798     def unpatch!(src, patchset)
799       Diff::LCS.patch(src, patchset, :unpatch)
800     end
801 
802       # Given a set of patchset, convert the current version to the next
803       # version. Does no auto-discovery.
804     def patch!(src, patchset)
805       Diff::LCS.patch(src, patchset, :patch)
806     end
807 
808 # private
809       # Compute the longest common subsequence between the sequenced Enumerables
810       # +a+ and +b+. The result is an array whose contents is such that
811       #
812       #     result = Diff::LCS.__lcs(a, b)
813       #     result.each_with_index do |e, ii|
814       #       assert_equal(a[ii], b[e]) unless e.nil?
815       #     end
816     def __lcs(a, b)
817       a_start = b_start = 0
818       a_finish = a.size - 1
819       b_finish = b.size - 1
820       vector = []
821 
822         # Prune off any common elements at the beginning...
823       while (a_start <= a_finish) and
824             (b_start <= b_finish) and
825             (a[a_start] == b[b_start])
826         vector[a_start] = b_start
827         a_start += 1
828         b_start += 1
829       end
830 
831         # Now the end...
832       while (a_start <= a_finish) and
833             (b_start <= b_finish) and
834             (a[a_finish] == b[b_finish])
835         vector[a_finish] = b_finish
836         a_finish -= 1
837         b_finish -= 1
838       end
839 
840         # Now, compute the equivalence classes of positions of elements.
841       b_matches = Diff::LCS.__position_hash(b, b_start .. b_finish)
842 
843       thresh = []
844       links = []
845 
846       (a_start .. a_finish).each do |ii|
847         ai = a.kind_of?(String) ? a[ii, 1] : a[ii]
848         bm = b_matches[ai]
849         kk = nil
850         bm.reverse_each do |jj|
851           if kk and (thresh[kk] > jj) and (thresh[kk - 1] < jj)
852             thresh[kk] = jj
853           else
854             kk = Diff::LCS.__replace_next_larger(thresh, jj, kk)
855           end
856           links[kk] = [ (kk > 0) ? links[kk - 1] : nil, ii, jj ] unless kk.nil?
857         end
858       end
859 
860       unless thresh.empty?
861         link = links[thresh.size - 1]
862         while not link.nil?
863           vector[link[1]] = link[2]
864           link = link[0]
865         end
866       end
867 
868       vector
869     end
870 
871       # Find the place at which +value+ would normally be inserted into the
872       # Enumerable. If that place is already occupied by +value+, do nothing
873       # and return +nil+. If the place does not exist (i.e., it is off the end
874       # of the Enumerable), add it to the end. Otherwise, replace the element
875       # at that point with +value+. It is assumed that the Enumerable's values
876       # are numeric.
877       #
878       # This operation preserves the sort order.
879     def __replace_next_larger(enum, value, last_index = nil)
880         # Off the end?
881       if enum.empty? or (value > enum[-1])
882         enum << value
883         return enum.size - 1
884       end
885 
886         # Binary search for the insertion point
887       last_index ||= enum.size
888       first_index = 0
889       while (first_index <= last_index)
890         ii = (first_index + last_index) >> 1
891 
892         found = enum[ii]
893 
894         if value == found
895           return nil
896         elsif value > found
897           first_index = ii + 1
898         else
899           last_index = ii - 1
900         end
901       end
902 
903         # The insertion point is in first_index; overwrite the next larger
904         # value.
905       enum[first_index] = value
906       return first_index
907     end
908 
909       # If +vector+ maps the matching elements of another collection onto this
910       # Enumerable, compute the inverse +vector+ that maps this Enumerable
911       # onto the collection. (Currently unused.)
912     def __inverse_vector(a, vector)
913       inverse = a.dup
914       (0 ... vector.size).each do |ii|
915         inverse[vector[ii]] = ii unless vector[ii].nil?
916       end
917       inverse
918     end
919 
920       # Returns a hash mapping each element of an Enumerable to the set of
921       # positions it occupies in the Enumerable, optionally restricted to the
922       # elements specified in the range of indexes specified by +interval+.
923     def __position_hash(enum, interval = 0 .. -1)
924       hash = Hash.new { |hh, kk| hh[kk] = [] }
925       interval.each do |ii|
926         kk = enum.kind_of?(String) ? enum[ii, 1] : enum[ii]
927         hash[kk] << ii
928       end
929       hash
930     end
931 
932       # Examine the patchset and the source to see in which direction the
933       # patch should be applied.
934       #
935       # WARNING: By default, this examines the whole patch, so this could take
936       # some time. This also works better with Diff::LCS::ContextChange or
937       # Diff::LCS::Change as its source, as an array will cause the creation
938       # of one of the above.
939     def __diff_direction(src, patchset, limit = nil)
940       count = left = left_miss = right = right_miss = 0
941       string = src.kind_of?(String)
942 
943       patchset.each do |change|
944         count += 1
945 
946         case change
947         when Diff::LCS::Change
948             # With a simplistic change, we can't tell the difference between
949             # the left and right on '!' actions, so we ignore those. On '='
950             # actions, if there's a miss, we miss both left and right.
951           element = string ? src[change.position, 1] : src[change.position]
952 
953           case change.action
954           when '-'
955             if element == change.element
956               left += 1
957             else
958               left_miss += 1
959             end
960           when '+'
961             if element == change.element
962               right += 1
963             else
964               right_miss += 1
965             end
966           when '='
967             if element != change.element
968               left_miss += 1
969               right_miss += 1
970             end
971           end
972         when Diff::LCS::ContextChange
973           case change.action
974           when '-' # Remove details from the old string
975             element = string ? src[change.old_position, 1] : src[change.old_position]
976             if element == change.old_element
977               left += 1
978             else
979               left_miss += 1
980             end
981           when '+'
982             element = string ? src[change.new_position, 1] : src[change.new_position]
983             if element == change.new_element
984               right += 1
985             else
986               right_miss += 1
987             end
988           when '='
989             le = string ? src[change.old_position, 1] : src[change.old_position]
990             re = string ? src[change.new_position, 1] : src[change.new_position]
991 
992             left_miss += 1 if le != change.old_element
993             right_miss += 1 if re != change.new_element
994           when '!'
995             element = string ? src[change.old_position, 1] : src[change.old_position]
996             if element == change.old_element
997               left += 1
998             else
999               element = string ? src[change.new_position, 1] : src[change.new_position]
1000               if element == change.new_element
1001                 right += 1
1002               else
1003                 left_miss += 1
1004                 right_miss += 1
1005               end
1006             end
1007           end
1008         end
1009 
1010         break if not limit.nil? and count > limit
1011       end
1012 
1013       no_left = (left == 0) and (left_miss >= 0)
1014       no_right = (right == 0) and (right_miss >= 0)
1015 
1016       case [no_left, no_right]
1017       when [false, true]
1018         return :patch
1019       when [true, false]
1020         return :unpatch
1021       else
1022         raise "The provided patchset does not appear to apply to the provided value as either source or destination value."
1023       end
1024     end
1025 
1026       # Normalize the patchset. A patchset is always a sequence of changes, but
1027       # how those changes are represented may vary, depending on how they were
1028       # generated. In all cases we support, we also support the array
1029       # representation of the changes. The formats are:
1030       #
1031       #   [ # patchset <- Diff::LCS.diff(a, b)
1032       #     [ # one or more hunks
1033       #       Diff::LCS::Change # one or more changes
1034       #     ] ]
1035       #
1036       #   [ # patchset, equivalent to the above
1037       #     [ # one or more hunks
1038       #       [ action, line, value ] # one or more changes
1039       #     ] ]
1040       #
1041       #   [ # patchset <- Diff::LCS.diff(a, b, Diff::LCS::ContextDiffCallbacks)
1042       #     #       OR <- Diff::LCS.sdiff(a, b, Diff::LCS::ContextDiffCallbacks)
1043       #     [ # one or more hunks
1044       #       Diff::LCS::ContextChange # one or more changes
1045       #     ] ]
1046       #
1047       #   [ # patchset, equivalent to the above
1048       #     [ # one or more hunks
1049       #       [ action, [ old line, old value ], [ new line, new value ] ]
1050       #         # one or more changes
1051       #     ] ]
1052       #
1053       #   [ # patchset <- Diff::LCS.sdiff(a, b)
1054       #     #       OR <- Diff::LCS.diff(a, b, Diff::LCS::SDiffCallbacks)
1055       #     Diff::LCS::ContextChange # one or more changes
1056       #   ]
1057       #
1058       #   [ # patchset, equivalent to the above
1059       #     [ action, [ old line, old value ], [ new line, new value ] ]
1060       #       # one or more changes
1061       #   ]
1062       #
1063       # The result of this will be either of the following.
1064       #
1065       #   [ # patchset
1066       #     Diff::LCS::ContextChange # one or more changes
1067       #   ]
1068       #
1069       #   [ # patchset
1070       #     Diff::LCS::Change # one or more changes
1071       #   ]
1072       #
1073       # If either of the above is provided, it will be returned as such.
1074       #
1075     def __normalize_patchset(patchset)
1076       patchset.map do |hunk|
1077         case hunk
1078         when Diff::LCS::ContextChange, Diff::LCS::Change
1079           hunk
1080         when Array
1081           if (not hunk[0].kind_of?(Array)) and hunk[1].kind_of?(Array) and hunk[2].kind_of?(Array)
1082             Diff::LCS::ContextChange.from_a(hunk)
1083           else
1084             hunk.map do |change|
1085               case change
1086               when Diff::LCS::ContextChange, Diff::LCS::Change
1087                 change
1088               when Array
1089                   # change[1] will ONLY be an array in a ContextChange#to_a call.
1090                   # In Change#to_a, it represents the line (singular).
1091                 if change[1].kind_of?(Array)
1092                   Diff::LCS::ContextChange.from_a(change)
1093                 else
1094                   Diff::LCS::Change.from_a(change)
1095                 end
1096               end
1097             end
1098           end
1099         else
1100           raise ArgumentError, "Cannot normalise a hunk of class #{hunk.class}."
1101         end
1102       end.flatten
1103     end
1104   end
1105 end

Generated on Mon May 10 23:40:27 -0700 2010 with rcov 0.9.8