Elegant Python function to convert CamelCase to snake

例:

>>> convert('CamelCase')
'camel_case'

This is pretty thorough:

def convert(name):
    s1 = re.sub('(.)([A-Z][a-z]+)', r'1_2', name)
    return re.sub('([a-z0-9])([A-Z])', r'1_2', s1).lower()

Works with all these (and doesn't harm already-un-cameled versions):

>>> convert('CamelCase')
'camel_case'
>>> convert('CamelCamelCase')
'camel_camel_case'
>>> convert('Camel2Camel2Case')
'camel2_camel2_case'
>>> convert('getHTTPResponseCode')
'get_http_response_code'
>>> convert('get2HTTPResponseCode')
'get2_http_response_code'
>>> convert('HTTPResponseCode')
'http_response_code'
>>> convert('HTTPResponseCodeXYZ')
'http_response_code_xyz'

Or if you're going to call it a zillion times, you can pre-compile the regexes:

first_cap_re = re.compile('(.)([A-Z][a-z]+)')
all_cap_re = re.compile('([a-z0-9])([A-Z])')
def convert(name):
    s1 = first_cap_re.sub(r'1_2', name)
    return all_cap_re.sub(r'1_2', s1).lower()

Don't forget to import the regular expression module

import re

There's an inflection library in the package index that can handle these things for you. In this case, you'd be looking for inflection.underscore() :

>>> inflection.underscore('CamelCase')
'camel_case'

I don't know why these are all so complicating.

for most cases the simple expression ([AZ]+) will do the trick

>>> re.sub('([A-Z]+)', r'_1','CamelCase').lower()
'_camel_case'  
>>> re.sub('([A-Z]+)', r'_1','camelCase').lower()
'camel_case'
>>> re.sub('([A-Z]+)', r'_1','camel2Case2').lower()
'camel2_case2'
>>> re.sub('([A-Z]+)', r'_1','camelCamelCase').lower()
'camel_camel_case'
>>> re.sub('([A-Z]+)', r'_1','getHTTPResponseCode').lower()
'get_httpresponse_code'

To ignore the first charachter simply add look behind (?!^)

>>> re.sub('(?!^)([A-Z]+)', r'_1','CamelCase').lower()
'camel_case'
>>> re.sub('(?!^)([A-Z]+)', r'_1','CamelCamelCase').lower()
'camel_camel_case'
>>> re.sub('(?!^)([A-Z]+)', r'_1','Camel2Camel2Case').lower()
'camel2_camel2_case'
>>> re.sub('(?!^)([A-Z]+)', r'_1','getHTTPResponseCode').lower()
'get_httpresponse_code'

If you want to separate ALLCaps to all_caps and expect numbers in your string you still don't need to do two separate runs just use | This expression ((?<=[a-z0-9])[AZ]|(?!^)[AZ](?=[az])) can handle just about every scenario in the book

>>> a = re.compile('((?<=[a-z0-9])[A-Z]|(?!^)[A-Z](?=[a-z]))')
>>> a.sub(r'_1', 'getHTTPResponseCode').lower()
'get_http_response_code'
>>> a.sub(r'_1', 'get2HTTPResponseCode').lower()
'get2_http_response_code'
>>> a.sub(r'_1', 'get2HTTPResponse123Code').lower()
'get2_http_response123_code'
>>> a.sub(r'_1', 'HTTPResponseCode').lower()
'http_response_code'
>>> a.sub(r'_1', 'HTTPResponseCodeXYZ').lower()
'http_response_code_xyz'

It all depends on what you want so use the solution that best fits your needs as it should not be overly complicated.

nJoy!

链接地址: http://www.djcxy.com/p/5438.html

上一篇: 将列表的字符串表示转换为列表

下一篇: 优雅的Python函数将CamelCase转换为蛇