Time performance when generating a very large text file in Python

I need to generate a very large text file. Each line has a simple format, seq_num<SPACE>num_val, for example:

    12343234 759

Let's assume I am going to generate a file with 100 million lines. I tried two approaches and, surprisingly, they give very different time performance.

The first approach is a for loop over the 100 million lines. In each iteration I build a short string of seq_num<SPACE>num_val and then write it to the file. This approach takes a long time.

    ## APPROACH 1
    for seq_id in seq_ids:
        num_val=rand()
        line=seq_id+' '+num_v
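The snippet is cut off before the second approach, so the following is only a minimal sketch of one common alternative to writing every line individually: collect lines in memory and write them in batches. The function name `write_number_file`, the `batch_size` default, and the value range 0 to 999 are assumptions made for illustration, not taken from the question. Whether batching actually helps depends on the Python version and the file buffering in use, so it is worth timing both variants on a smaller line count first.

```python
import random


def write_number_file(path, n_lines, batch_size=100_000, seed=0):
    """Write n_lines lines of the form 'seq_num num_val' to path."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    batch = []
    with open(path, "w") as f:
        for seq_id in range(n_lines):
            # Build one 'seq_num<SPACE>num_val' line, as in the question.
            batch.append(f"{seq_id} {rng.randint(0, 999)}\n")
            if len(batch) >= batch_size:
                # One write call per batch instead of one per line.
                f.write("".join(batch))
                batch.clear()
        if batch:
            f.write("".join(batch))  # write the final partial batch
```

Calling `write_number_file("out.txt", 1_000_000)` first gives a feel for the cost before scaling up to 100 million lines.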

Why can set() sometimes be replaced with {}?

In PyCharm, when I write:

    return set([(sy + ady, sx + adx)])

it says "Function call can be replaced with set literal", so it replaces it with:

    return {(sy + ady, sx + adx)}

Why is that? Isn't a set() in Python different from a dictionary {}? And if PyCharm wants to optimize this, why is the literal more efficient?

Python sets and dictionaries can both be constructed using curly braces:

    my_dict = {'a': 1, 'b': 2}
    my_set = {1, 2, 3}

The interpreter (and human readers) can distinguish them by their contents. However, it is not possible
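A short check of the two spellings may help here. The tuple `(1, 2)` is a made-up stand-in for `(sy + ady, sx + adx)`. The second half shows the one case where braces cannot produce a set: empty braces always create a dict.

```python
# A non-empty set can be written as a literal or built from an iterable.
from_call = set([(1, 2)])
from_literal = {(1, 2)}
same = from_call == from_literal      # both are sets holding one tuple

# Empty braces are a dict, so the empty set has no literal form.
empty_braces_type = type({})          # dict
empty_set_type = type(set())          # set
```

The literal form also avoids building the temporary list that `set([...])` creates and then discards.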

[] and {} vs list() and dict(), which is better?

I know they are essentially the same thing, but in terms of style, which is the better (more Pythonic) one to use to create an empty list or dict?

In terms of speed, it's no competition for empty lists/dicts:

    >>> from timeit import timeit
    >>> timeit("[]")
    0.040084982867934334
    >>> timeit("list()")
    0.17704233359267718
    >>> timeit("{}")
    0.033620194745424214
    >>> timeit("dict()")
    0.1821558326547077

And for non-empty:
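The timings above do not say where the gap comes from. One way to look into it, sketched here for CPython only, is to compare the bytecode of the two expressions: the literal compiles to a single build instruction, while `list()` needs a name lookup followed by a call. The helper name `opnames` is made up for this example, and exact instruction names vary between CPython versions.

```python
import dis


def opnames(expr):
    """Return the bytecode instruction names for a single expression."""
    code = compile(expr, "<expr>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("[]")     # contains BUILD_LIST on CPython
call_ops = opnames("list()")    # contains a name lookup and a call instruction
```

Printing `literal_ops` and `call_ops` side by side shows the extra steps the call form goes through.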

List of zeros in Python

This question already has an answer here: Create an empty list in python with certain size (11 answers)

    #add code here to figure out the number of 0's you need, naming the variable n.
    listofzeros = [0] * n

If you prefer to put it in a function, just drop in that code and add return listofzeros, which would look like this:

    def zerolistmaker(n):
        listofzeros = [0] * n
        return listofzeros

Sample output:

    >>> zerolistmaker(4)
    [0, 0, 0, 0]
    >>>
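The same multiplication idiom behaves differently once the repeated element is itself a list, which matters if the list of zeros is later extended to a grid. The sketch below goes slightly beyond the answer above: the names `shared` and `grid` and the 3-by-2 size are chosen only for illustration.

```python
def zerolistmaker(n):
    """Return a list of n zeros."""
    return [0] * n


flat = zerolistmaker(4)

# Repeating a nested list copies references, so every row is the same object.
shared = [[0] * 2] * 3
shared[0][0] = 1          # the change shows up in all three rows

# A comprehension builds a separate inner list for each row.
grid = [[0] * 2 for _ in range(3)]
grid[0][0] = 1            # the change affects only the first row
```

For a flat list of integers, `[0] * n` is safe because integers are immutable.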

Why is "if not someobj:" better than "if someobj == None:" in Python?

I've seen several examples of code like this:

    if not someobj:
        #do something

But I'm wondering why not write:

    if someobj == None:
        #do something

Is there any difference? Does one have an advantage over the other?

In the first test, Python tries to convert the object to a bool value if it is not already one. Roughly, we are asking the object: are you meaningful or not? This is done using the following algorithm:

If the object has a __nonzero__ special method (as do the numeric built-ins, int and float), it calls this
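The two tests answer different questions, which a small example can make concrete. Note that `__nonzero__` is the Python 2 name; Python 3 calls the same hook `__bool__` and falls back to `__len__` when it is absent. The `Basket` class below is hypothetical and exists only to illustrate that fallback.

```python
class Basket:
    """Hypothetical container used only to illustrate truthiness."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # With no __bool__ defined, bool() falls back to len() != 0.
        return len(self.items)


empty = Basket([])

is_falsy = not empty        # True: the basket has length 0
is_none = empty is None     # False: the object exists, it is just empty
```

So `if not someobj:` also fires for empty containers, zero, and empty strings, whereas a comparison with None (idiomatically `is None`) fires only for None itself.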

Installation error with ipython on MacBook

I am trying to install ipython on my MacBook using the command:

    $ sudo easy_install ipython

Before that I installed brew. But when I run the ipython install command, I get the following error:

    error: Setup script exited with error in ipython setup command: Invalid environment marker: sys_platform == "darwin" and platform_python_implementation == "CPython"

Could someone help me solve this problem? I need to develop a project quickly. Thanks in advance!

How to configure Sphinx autoflask to document Flask

I have a Flask app and I want to use Sphinx's autoflask directive to document its flask-restful API.

https://pythonhosted.org/sphinxcontrib-httpdomain/#module-sphinxcontrib.autohttp.flask

I have installed the module via pip and run sphinx-quickstart, which gives me a conf.py and index.rst. I've tried putting the extension into conf.py:

    extensions = ['sphinxcontrib.autohttp.flask']

and put the directive into index.rst as per the documentation:

    .. autoflask:: autoflask_sampleapp:ap

Why is string's startswith slower than in?

Surprisingly, I find startswith is slower than in:

    In [10]: s="ABCD"*10
    In [11]: %timeit s.startswith("XYZ")
    1000000 loops, best of 3: 307 ns per loop
    In [12]: %timeit "XYZ" in s
    10000000 loops, best of 3: 81.7 ns per loop

As we all know, the in operation needs to search the whole string and startswith just needs to check the first few characters, so startswith should be more efficient.

When s is large enough, startswith is faster:

    In [13]: s="ABCD"*200
    In [14]: %timeit s.startswi
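The IPython measurements above can be reproduced outside IPython with the standard `timeit` module. This is a minimal sketch: the helper name `compare` and the iteration count are assumptions, and the absolute numbers will differ by machine and Python version.

```python
from timeit import timeit


def compare(repeat_count, number=100_000):
    """Time startswith against `in` for s = "ABCD" * repeat_count."""
    setup = f's = "ABCD" * {repeat_count}'
    t_startswith = timeit('s.startswith("XYZ")', setup=setup, number=number)
    t_in = timeit('"XYZ" in s', setup=setup, number=number)
    return t_startswith, t_in


short_times = compare(10)    # the small string from the question
long_times = compare(200)    # the larger string from the question
```

Each call returns a `(startswith_seconds, in_seconds)` pair, so the two string sizes can be compared directly.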

Why is it slower to iterate over a small string than a small list?

I was playing around with timeit and noticed that doing a simple list comprehension over a small string took longer than doing the same operation on a list of small single-character strings. Any explanation? It takes almost 1.35 times as long.

    >>> from timeit import timeit
    >>> timeit("[x for x in 'abc']")
    2.0691067844831528
    >>> timeit("[x for x in ['a', 'b', 'c']]")
    1.5286479570345861

What's happening on a lower level?

TL;DR Once much of the overhead is removed, the actual speed difference is close to
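The answer is cut off, so what "overhead" refers to is not recoverable from this snippet. One possible reading, sketched here as an assumption, is to create the string and the list once in `setup` so that the timed statement contains only the iteration. The iteration count `N` is arbitrary.

```python
from timeit import timeit

N = 200_000  # iterations per measurement; an arbitrary choice

# Literals written inside the timed statement, as in the question.
t_str_inline = timeit("[x for x in 'abc']", number=N)
t_list_inline = timeit("[x for x in ['a', 'b', 'c']]", number=N)

# Objects created once in setup, so the timed statement only iterates.
t_str_setup = timeit("[x for x in s]", setup="s = 'abc'", number=N)
t_list_setup = timeit("[x for x in l]", setup="l = ['a', 'b', 'c']", number=N)
```

Comparing the ratio `t_str_inline / t_list_inline` with `t_str_setup / t_list_setup` shows how much of the gap survives on a given interpreter.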

Why is C++ much faster than Python with Boost?

My goal is to write a small library for spectral finite elements in Python, and to that end I tried extending Python with a C++ library using Boost, in the hope that it would make my code faster.

    class Quad {
    public:
        Quad(int, int);
        double integrate(boost::function<double(std::vector<double> const&)> const&);
        double integrate_wrapper(boost::python::object const&);
        std::vector< std::vector<double> > node