Find the longest common starting substring in a se

2020-01-25 03:41发布

This is a challenge to come up with the most elegant JavaScript, Ruby or other solution to a relatively trivial problem.

This problem is a more specific case of the Longest common substring problem. I need to only find the longest common starting substring in an array. This greatly simplifies the problem.

For example, the longest substring in [interspecies, interstelar, interstate] is "inters". However, I don't need to find "ific" in [specifics, terrific].

I've solved the problem by quickly coding up a solution in JavaScript as a part of my answer about shell-like tab-completion (test page here). Here is that solution, slightly tweaked:

function common_substring(data) {
  var i, ch, memo, idx = 0
  do {
    memo = null
    for (i=0; i < data.length; i++) {
      ch = data[i].charAt(idx)
      if (!ch) break
      if (!memo) memo = ch
      else if (ch != memo) break
    }
  } while (i == data.length && idx < data.length && ++idx)

  return (data[0] || '').slice(0, idx)
}

This code is available in this Gist along with a similar solution in Ruby. You can clone the gist as a git repo to try it out:

$ git clone git://gist.github.com/257891.git substring-challenge

I'm not very happy with those solutions. I have a feeling they might be solved with more elegance and less execution complexity—that's why I'm posting this challenge.

I'm going to accept as an answer the solution I find the most elegant or concise. Here is for instance a crazy Ruby hack I come up with—defining the & operator on String:

# works with Ruby 1.8.7 and above
class String
  def &(other)
    difference = other.to_str.each_char.with_index.find { |ch, idx|
      self[idx].nil? or ch != self[idx].chr
    }
    difference ? self[0, difference.last] : self
  end
end

class Array
  def common_substring
    self.inject(nil) { |memo, str| memo.nil? ? str : memo & str }.to_s
  end
end

Solutions in JavaScript or Ruby are preferred, but you can show off clever solution in other languages as long as you explain what's going on. Only code from standard library please.

Update: my favorite solutions

I've chosen the JavaScript sorting solution by kennebec as the "answer" because it struck me as both unexpected and genius. If we disregard the complexity of actual sorting (let's imagine it's infinitely optimized by the language implementation), the complexity of the solution is just comparing two strings.

Other great solutions:

Thanks for participating! As you can see from the comments, I learned a lot (even about Ruby).

30条回答
冷血范
2楼-- · 2020-01-25 04:03

I would do the following:

  1. Take the first string of the array as the initial starting substring.
  2. Take the next string of the array and compare the characters until the end of one of the strings is reached or a mismatch is found. If a mismatch is found, reduce starting substring to the length where the mismatch was found.
  3. Repeat step 2 until all strings have been tested.

Here’s a JavaScript implementation:

var array = ["interspecies", "interstelar", "interstate"],
    prefix = array[0],
    len = prefix.length;
for (i=1; i<array.length; i++) {
    for (j=0, len=Math.min(len,array[j].length); j<len; j++) {
        if (prefix[j] != array[i][j]) {
            len = j;
            prefix = prefix.substr(0, len);
            break;
        }
    }
}
查看更多
萌系小妹纸
3楼-- · 2020-01-25 04:03

Realizing the risk of this turning into a match of code golf (or is that the intention?), here's my solution using sed, copied from my answer to another SO question and shortened to 36 chars (30 of which are the actual sed expression). It expects the strings (each on a seperate line) to be supplied on standard input or in files passed as additional arguments.

sed 'N;s/^\(.*\).*\n\1.*$/\1\n\1/;D'

A script with sed in the shebang line weighs in at 45 chars:

#!/bin/sed -f
N;s/^\(.*\).*\n\1.*$/\1\n\1/;D

A test run of the script (named longestprefix), with strings supplied as a "here document":

$ ./longestprefix <<EOF
> interspecies
> interstelar
> interstate
> EOF
inters
$
查看更多
【Aperson】
4楼-- · 2020-01-25 04:05

Combining answers by kennebec, Florian F and jberryman yields the following Haskell one-liner:

commonPrefix l = map fst . takeWhile (uncurry (==)) $ zip (minimum l) (maximum l)

With Control.Arrow one can get a point-free form:

commonPrefix = map fst . takeWhile (uncurry (==)) . uncurry zip . (minimum &&& maximum)
查看更多
贪生不怕死
5楼-- · 2020-01-25 04:05

Golfed JS solution just for fun:

w=["hello", "hell", "helen"];
c=w.reduce(function(p,c){
    for(r="",i=0;p[i]==c[i];r+=p[i],i++){}
    return r;
});
查看更多
男人必须洒脱
6楼-- · 2020-01-25 04:06

In Python:

>>> from os.path import commonprefix
>>> commonprefix('interspecies interstelar interstate'.split())
'inters'
查看更多
够拽才男人
7楼-- · 2020-01-25 04:07

Doesn't seem that complicated if you're not too concerned about ultimate performance:

def common_substring(data)
  data.inject { |m, s| s[0,(0..m.length).find { |i| m[i] != s[i] }.to_i] }
end

One of the useful features of inject is the ability to pre-seed with the first element of the array being interated over. This avoids the nil memo check.

puts common_substring(%w[ interspecies interstelar interstate ]).inspect
# => "inters"
puts common_substring(%w[ feet feel feeble ]).inspect
# => "fee"
puts common_substring(%w[ fine firkin fail ]).inspect
# => "f"
puts common_substring(%w[ alpha bravo charlie ]).inspect
# => ""
puts common_substring(%w[ fork ]).inspect
# => "fork"
puts common_substring(%w[ fork forks ]).inspect
# => "fork"

Update: If golf is the game here, then 67 characters:

def f(d)d.inject{|m,s|s[0,(0..m.size).find{|i|m[i]!=s[i]}.to_i]}end
查看更多
登录 后发表回答