Parsing text by regex, split, tokenize, or hash
I am parsing a CSV file that contains text that represents duration, which might be any combination of hours, minutes, or both. For example:
I want to be able to do this: duration = h.hours + m.minutes, and make sure that h is the hours value (if it exists), and likewise m for the minutes.
I tried solving this with the regex /(\d*)\s?hour\D*(\d*)\s?min/, but this won't detect minutes alone, or hours alone.
So I changed it to /(\d+)\s?\D*\s?(\d*)/, but that's wrong too, because there is no way to tell whether a captured value is hours or minutes, so I can't convert it to hours or minutes correctly.
I am confused about which approach would solve this problem in my app. Is it regex, hash, matching, or some other way? Any help or advice is appreciated.
This is pretty straightforward to match with a regex if you know that at least one of those is present in the string. For example:
(?:(\d+)\s*hours?)?\s*(?:(\d+)\s*minutes?)?
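A minimal sketch of how that pattern could be used, with the escapes written out as \d and \s. The method name parse_duration, the anchoring with \A and \z, and the choice to return plain seconds (so no ActiveSupport is needed) are assumptions of this sketch, not part of the answer above. Because both groups are optional, the pattern also matches an empty string, so the sketch treats "nothing captured" as a failure.

```ruby
# Hypothetical helper: returns total seconds, or nil when no unit was found.
def parse_duration(text)
  # Anchored and case-insensitive variant of the pattern (an assumption).
  pattern = /\A\s*(?:(\d+)\s*hours?)?\s*(?:(\d+)\s*minutes?)?\s*\z/i
  match = pattern.match(text)
  return nil if match.nil? || (match[1].nil? && match[2].nil?)

  hours = match[1].to_i    # nil.to_i is 0 when the hours part is absent
  minutes = match[2].to_i  # likewise for the minutes part
  hours * 3600 + minutes * 60
end

p parse_duration("1 hour 30 minutes") # => 5400
p parse_duration("2 hours")           # => 7200
p parse_duration("45 minutes")        # => 2700
p parse_duration("soon")              # => nil
```

With ActiveSupport loaded, the last line of the method could instead be hours.hours + minutes.minutes, as in the question.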
Here's one fancy way:
def string_to_duration(string)
  # Needs ActiveSupport: number.to_i.send("hours") calls Integer#hours, etc.
  string.downcase.scan(/(\d+)\s+(hours?|minutes?)/).map do |number, unit|
    number.to_i.send(unit)
  end.reduce(:+)
end
Test:
require "active_support/all"

input = [
  "1 hour 30 minutes",
  "2 hours",
  "45 minutes"
]

def string_to_duration(string)
  string.downcase.scan(/(\d+)\s+(hours?|minutes?)/).map do |number, unit|
    number.to_i.send(unit)
  end.reduce(:+)
end

input.each do |str|
  puts string_to_duration str
end
Output:
5400
7200
2700
Note: This would also accept duplicate units, so "1 minute 1 minute 1 minute" will print 180.
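If duplicate units should be rejected rather than summed, one possible variation is sketched below. The method name strict_duration_seconds, the seconds-per-unit table, and the decision to raise ArgumentError are all assumptions made for illustration; it also returns plain seconds so that it runs without ActiveSupport.

```ruby
# Hypothetical stricter variant: each unit may appear at most once.
def strict_duration_seconds(string)
  seconds_per_unit = { "hour" => 3600, "minute" => 60 }

  # Each match is a [number, unit] pair; the trailing "s" is not captured.
  pairs = string.downcase.scan(/(\d+)\s+(hour|minute)s?/)
  units = pairs.map { |_number, unit| unit }
  if units.uniq.length != units.length
    raise ArgumentError, "duplicate unit in #{string.inspect}"
  end

  pairs.sum { |number, unit| number.to_i * seconds_per_unit.fetch(unit) }
end

puts strict_duration_seconds("1 hour 30 minutes") # prints 5400

begin
  strict_duration_seconds("1 minute 1 minute 1 minute")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```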
Here's what I would do; I believe it is the most straightforward way:
str = "1 hour 30 minutes"
h = str[/(\d+) hour/, 1].to_i rescue 0
m = str[/(\d+) minute/, 1].to_i rescue 0
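To tie this back to the question, here is a small sketch that wraps the two lookups in a method and combines them. The name duration_in_seconds is hypothetical, and the result is plain seconds so the example runs without ActiveSupport. It also leaves out the rescue, on the assumption that it is not needed: String#[] returns nil when the pattern does not match, and nil.to_i is already 0.

```ruby
# Hypothetical wrapper around the two lookups; returns total seconds.
def duration_in_seconds(str)
  h = str[/(\d+) hour/, 1].to_i   # 0 when there is no hours part
  m = str[/(\d+) minute/, 1].to_i # 0 when there is no minutes part
  h * 3600 + m * 60
end

p duration_in_seconds("1 hour 30 minutes") # => 5400
p duration_in_seconds("2 hours")           # => 7200
p duration_in_seconds("45 minutes")        # => 2700
```

With ActiveSupport loaded, h.hours + m.minutes gives the duration in the form the question asks for.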