Download all the links(related documents) on a webpage using Python

I have to download a lot of documents from a webpage. They are wmv files, PDF, BMP, etc. Of course, all of them have links to them. So each time I have to right-click a file, select 'Save Link As', then save it as type All Files. Is it possible to do this in Python? I searched the SO database, and folks have answered the question of how to get the links from a webpage. I want to download the actual files.

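Download all the links (related documents) on a webpage using Python

I have to download a lot of documents from a webpage. They are wmv files, PDF, BMP, etc. Of course, they all have links. So each time I have to right-click a file, select 'Save Link As', and then save it as type All Files. Is it possible to do this in Python? I searched the SO database, and people have answered the question of how to get the links from a webpage. I want to download the actual files. Thanks in advance. (This is not a homework question :)). Here is an example of how to download some selected files from http://pypi.python.org/pypi/xlwt You first need to install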
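The answer quoted above is cut off before its code, so the following is only a minimal sketch of the same idea using the standard library, not the answer's own method. The names `LinkCollector`, `document_links`, `download_all` and the `WANTED` extension list are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve

# Assumed list of file types; extend it to match what the page links to.
WANTED = (".wmv", ".pdf", ".bmp")


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def document_links(html, base_url, extensions=WANTED):
    """Return absolute URLs of links whose path ends in one of `extensions`."""
    collector = LinkCollector()
    collector.feed(html)
    urls = (urljoin(base_url, href) for href in collector.hrefs)
    return [u for u in urls if urlparse(u).path.lower().endswith(extensions)]


def download_all(page_url, dest_dir="."):
    """Fetch `page_url` and save every matching linked file into `dest_dir`."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for url in document_links(html, page_url):
        # Name the local file after the last component of the URL path.
        target = Path(dest_dir) / Path(urlparse(url).path).name
        urlretrieve(url, str(target))
```

Calling `download_all(page_url)` with the address of the page would save each matching file into the current directory; pages that require a login or build their links with JavaScript are outside what this sketch handles.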

How to load data from a file, for a unit test, in Python?

I've written a specialized HTML parser that I want to unit test with a couple of sample webpages I've downloaded. In Java, I've used class resources to load data into unit tests without having to rely on them being at a particular path on the file system. Is there a way to do this in Python? I found the doctest.testfile() function, but that appears to be specific to doctests.

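How to load data from a file, for a unit test, in Python?

I've written a specialized HTML parser that I want to unit test with a couple of sample webpages I've downloaded. In Java, I've used class resources to load data into unit tests without having to rely on them being at a particular path on the file system. Is there a way to do this in Python? I found the doctest.testfile() function, but that appears to be specific to doctests. I just want to get a file handle to a particular HTML file that is relative to the current module. Thanks in advance for any suggestions! To load data from a file in a unit test, if the test data and the unit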
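One way to get a file relative to the current module is to build the path from the module's own `__file__`. The sketch below assumes the sample pages live in a `testdata` folder next to the test module; that folder name and the helper names `fixture_path` and `read_fixture` are hypothetical.

```python
from pathlib import Path


def fixture_path(module_file, name, subdir="testdata"):
    """Build the path of a data file stored next to the given module file."""
    return Path(module_file).resolve().parent / subdir / name


def read_fixture(module_file, name, subdir="testdata"):
    """Return the text of a data file stored next to the given module file."""
    return fixture_path(module_file, name, subdir).read_text(encoding="utf-8")
```

Inside a test module this would be called as `read_fixture(__file__, "sample.html")`, so the test no longer depends on the directory it is launched from.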

get script directory name

I know I can use this to get the full file path: os.path.dirname(os.path.realpath(__file__)) But I want just the name of the folder my script is in. So if I have my_script.py and it is located at /home/user/test/my_script.py, I want to return "test". How could I do this? Thanks. import os os.path.basename(os.path.dirname(os.path.realpath(__file__))) Broken down: currentFile = __file__ # M

get script directory name

I know I can use this to get the full file path: os.path.dirname(os.path.realpath(__file__)) But I want just the name of the folder my script is in. If I have my_script.py and it is located at /home/user/test/my_script.py, I want to return "test". How could I do this? Thanks. import os os.path.basename(os.path.dirname(os.path.realpath(__file__))) Broken down: currentFile = __file__ # May be 'my_script', or './my_script' or # '/
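The one-liner from the answer can be wrapped in a small function so that it is easy to try on any path. The function name `containing_folder_name` is made up for this sketch; the calls inside it are the ones the answer uses.

```python
import os


def containing_folder_name(path):
    """Return only the name of the directory that contains `path`."""
    full_path = os.path.realpath(path)       # absolute path, symlinks resolved
    folder = os.path.dirname(full_path)      # drop the file name
    return os.path.basename(folder)          # keep only the last component
```

In a script, `containing_folder_name(__file__)` gives the folder name; for the path in the question that is the `test` component.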

how do you get the current local directory in python

This question already has an answer here: Find current directory and file's directory [duplicate] 15 answers I would use basename: import os path = os.getcwd() print(os.path.basename(path)) Try these: import os print("Path at terminal when executing this file") print(os.getcwd() + "\n") print("This file path, relative to os.getcwd()") print(__file__ + "\n") print("This file full path (follow

how do you get the current local directory in python

This question already has an answer here: Find current directory and file's directory [duplicate] 15 answers I would use basename: import os path = os.getcwd() print(os.path.basename(path)) Try these: import os print("Path at terminal when executing this file") print(os.getcwd() + "\n") print("This file path, relative to os.getcwd()") print(__file__ + "\n") print("This file full path (following symlinks)") full_path = os.path.realpath(
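The two answers mix two different things: the directory the interpreter was started from and the directory the script lives in. A small sketch that returns both side by side may make the difference easier to see; the function name `describe_locations` and the dictionary keys are hypothetical.

```python
import os


def describe_locations(script_path):
    """Collect the working directory and the script's own location."""
    full_path = os.path.realpath(script_path)  # absolute, symlinks followed
    return {
        "cwd": os.getcwd(),                          # where the process runs
        "cwd_name": os.path.basename(os.getcwd()),   # last component only
        "script_full_path": full_path,
        "script_dir": os.path.dirname(full_path),    # where the script lives
    }
```

Calling `describe_locations(__file__)` from a script started in another directory shows `cwd` and `script_dir` diverging.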

How to indicate the current directory of the script not me?

This question already has an answer here: Find current directory and file's directory [duplicate] 15 answers __file__ is the path of the script, albeit often a relative path. Use: import os.path scriptdir = os.path.dirname(os.path.abspath(__file__)) to create an absolute path to the directory, then use os.path.join() to open your file: with open(os.path.join(scriptdir, './input1.txt'

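How to indicate the current directory of the script not me?

This question already has an answer here: Find current directory and file's directory [duplicate] 15 answers __file__ is the path of the script, albeit often a relative path. Use: import os.path scriptdir = os.path.dirname(os.path.abspath(__file__)) to create an absolute path to the directory, then use os.path.join() to open your file: with open(os.path.join(scriptdir, './input1.txt')) as openfile: If you have setuptools / distribute installed, you can use the pkg_resources functions to access files: im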

Importing modules from parent folder

I am running Python 2.5. This is my folder tree: ptdraft/ nib.py simulations/ life/ life.py (I also have __init__.py in each folder, omitted here for readability) How do I import the nib module from inside the life module? I am hoping it is possible to do without tinkering with sys.path. Note: The main module being run is in the ptdraft folder. It seems that the problem i

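Importing modules from parent folder

I am running Python 2.5. This is my folder tree: ptdraft/ nib.py simulations/ life/ life.py (I also have __init__.py in each folder, omitted here for readability) How do I import the nib module from inside the life module? I am hoping it is possible to do without tinkering with sys.path. Note: The main module being run is in the ptdraft folder. It seems that the problem is not related to the module being in a parent directory or anything like that. You need to add the directory that contains ptdraft to PYTHONPATH. You said that import nib with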
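The answer recommends putting the directory that contains ptdraft on PYTHONPATH. A run-time equivalent is to add that directory to `sys.path`, which is the tinkering the asker hoped to avoid, so treat this as a sketch of the fallback rather than the recommended fix. The helper name `add_ancestor_to_sys_path` is hypothetical.

```python
import os
import sys


def add_ancestor_to_sys_path(module_file, levels_up):
    """Put the directory `levels_up` levels above `module_file`'s folder on sys.path.

    Returns the directory that was selected.
    """
    directory = os.path.dirname(os.path.abspath(module_file))
    for _ in range(levels_up):
        directory = os.path.dirname(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory
```

From inside `life.py`, `add_ancestor_to_sys_path(__file__, 3)` selects the directory containing ptdraft (life, then simulations, then ptdraft, then its parent), after which `from ptdraft import nib` can be resolved.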

Change directory to the directory of a Python script

How do I change directory to the directory containing my Python script? So far I figured out I should use os.chdir and sys.argv[0]. I'm sure there is a better way than writing my own function to parse argv[0]. os.chdir(os.path.dirname(__file__)) Sometimes __file__ is not defined; in that case you can try sys.path[0]. os.chdir(os.path.dirname(os.path.abspath(__file__))) should do it. os.chdir(os.path.di

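Change directory to the directory of a Python script

How do I change directory to the directory containing my Python script? So far I figured out I should use os.chdir and sys.argv[0]. I'm sure there is a better way than writing my own function to parse argv[0]. os.chdir(os.path.dirname(__file__)) Sometimes __file__ is not defined; in that case you can try sys.path[0]. os.chdir(os.path.dirname(os.path.abspath(__file__))) should do it. os.chdir(os.path.dirname(__file__)) will not work if the script is run from the directory it is in.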
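The last remark is worth spelling out: when a script is started from its own directory, `__file__` can be a bare file name, `os.path.dirname()` of it is an empty string, and `os.chdir("")` fails. A minimal sketch of the `abspath` variant, with a hypothetical function name:

```python
import os


def chdir_to_script_dir(script_path):
    """Change the working directory to the folder containing `script_path`."""
    # abspath() comes first: dirname() of a bare name such as "script.py"
    # is "", and os.chdir("") raises an error.
    script_dir = os.path.dirname(os.path.abspath(script_path))
    os.chdir(script_dir)
    return script_dir
```

In a script this would be called once near the top as `chdir_to_script_dir(__file__)`.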

How to know the path of the running script in Python?

My script.py creates a temporary file in the same dir as the script. When running it: python script.py it works just fine, but it doesn't work when you run: python /path/to/script.py That's because I'm using a relative path to my temp file in script.py rather than an absolute one. The problem is that I don't know which path it will be running in, so I need a way to dina

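How to know the path of the running script in Python?

My script.py creates a temporary file in the same directory as the script. When running it: python script.py it works just fine, but it doesn't work when you run: python /path/to/script.py That's because I'm using a relative path to the temp file in script.py rather than an absolute one. The problem is that I don't know which path it will be running in, so I need a way to know this. What about: os.path.abspath(os.path.dirname(__file__)) According to the great Dive Into Python: import sys, os print 'sys.ar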

Path to current file depends on how I execute the program

This is my Python program: #!/usr/bin/env python import os BASE_PATH = os.path.dirname(__file__) print BASE_PATH If I run this using python myfile.py it prints an empty string. If I run it using myfile.py, it prints the correct path. Why is this? I'm using Windows Vista and Python 2.6.2. It's just a harmless Windows quirk; you can compensate by using os.path.abspath(__file__),

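Path to current file depends on how I execute the program

This is my Python program: #!/usr/bin/env python import os BASE_PATH = os.path.dirname(__file__) print BASE_PATH If I run this using python myfile.py it prints an empty string. If I run it using myfile.py, it prints the correct path. Why is this? I'm using Windows Vista and Python 2.6.2. It's just a harmless Windows quirk; you can compensate by using os.path.abspath(__file__), see the docs. os.path.normpath(os.path.join(os.getcwd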
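The empty string is what `os.path.dirname()` returns for any path that has no directory part. The short sketch below reproduces that with a literal file name standing in for `__file__`, then applies the `os.path.abspath()` compensation the answer suggests. The variable names are illustrative only.

```python
import os

relative_name = "myfile.py"  # stand-in for a __file__ with no directory part

# dirname() only splits the string; with no separator there is nothing to return.
bare_dir = os.path.dirname(relative_name)

# abspath() joins the name onto the current working directory first,
# so dirname() then has a real directory to return.
fixed_dir = os.path.dirname(os.path.abspath(relative_name))

print(repr(bare_dir))
print(fixed_dir)
```

The second value is only the script's directory when the working directory has not changed since start-up, which is why the conversion is usually done once at the top of the program.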

Find path to currently running file

How can I find the full path to the currently running Python script? That is to say, what do I have to put to achieve this: Nirvana@bahamut:/tmp$ python baz.py running from /tmp file is baz.py __file__ is NOT what you are looking for. Don't use accidental side-effects. sys.argv[0] is always the path to the script (if in fact a script has been invoked) -- see http://docs.python.org/libra

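Find path to currently running file

How can I find the full path to the currently running Python script? That is to say, what do I have to put to achieve this: Nirvana@bahamut:/tmp$ python baz.py running from /tmp file is baz.py __file__ is NOT what you are looking for. Don't use accidental side-effects. sys.argv[0] is always the path to the script (if in fact a script has been invoked) -- see http://docs.python.org/library/sys.html#sys.argv __file__ is the path of the currently executing file (script or module). If you access it from a script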
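The two answers disagree on whether to start from `sys.argv[0]` or `__file__`. The sketch below sidesteps that by taking the path as an argument and defaulting to `sys.argv[0]`, then splits it into the two pieces the question asks for. The function name `script_location` is hypothetical, and the default is one possible choice rather than a recommendation from either answer.

```python
import os
import sys


def script_location(script_path=None):
    """Return (directory, file name) for a script path.

    Defaults to sys.argv[0], which holds the script path when a script
    was invoked; pass __file__ instead to describe the current module.
    """
    if script_path is None:
        script_path = sys.argv[0]
    full_path = os.path.abspath(script_path)
    return os.path.dirname(full_path), os.path.basename(full_path)
```

A script could then print the two values with `print("running from", directory)` and `print("file is", name)` after unpacking the result.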