
Python Website Intrusion Code (Python *** Attack Code)

Posted by hacker on 2022-07-06 in Hacker Security

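Article outline:

1. How to scrape a website that requires login with Python
2. Python code for fetching a web page
3. How to fetch a web page's source code with Python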

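How to scrape a website that requires login with Python

I recently had to scrape some pages from a website that requires login. It was less straightforward than I expected, so I decided to write a tutorial about it.

In this tutorial we will scrape a list of projects from our own bitbucket account.

The code from this tutorial is available on my GitHub.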

We will proceed in the following steps:

1. Extract the details needed to log in

2. Log in to the site

3. Scrape the required data

This tutorial uses the following packages (listed in requirements.txt):

requests

lxml

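Step 1: Study the website

Open the login page

Go to "bitbucket.org/account/signin". You will see the login page (log out first in case you are already signed in).

Look closely at the details we need to extract in order to log in

In this part we build a dictionary that holds the login details:

1. Right-click the "Username or email" field and choose "Inspect element". We use the value of the input whose "name" attribute is "username". "username" is the key, and our user name or email address is the value (on other sites the key might be "email", "user_name", "login", and so on).

2. Right-click the "Password" field and choose "Inspect element". In the script we use the value of the input whose "name" attribute is "password". "password" is the dictionary key, and the password we type is the value (on other sites the key might be "userpassword", "loginpassword", "pwd", and so on).

3. In the page source, look for a hidden input tag named "csrfmiddlewaretoken". "csrfmiddlewaretoken" is the key, and the value is the content of that hidden input (on other sites the hidden input might be named "csrftoken" or "authenticationtoken"). For example: "Vy00PE3Ra6aISwKBrPn72SFml00IcUV8".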
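The hidden-input lookup in item 3 can also be done in code rather than by hand. The following is a minimal sketch that uses the standard library's html.parser instead of the lxml package the tutorial lists, so it runs without extra installs. The class name, function name, and sample form markup are hypothetical; only the field name and the sample token come from the text above.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        # Attribute values can be None for valueless attributes, hence "or".
        input_type = (attributes.get("type") or "").lower()
        if input_type == "hidden" and attributes.get("name"):
            self.hidden_fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_field(html, field_name):
    """Return the value of the hidden input called field_name, or None."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden_fields.get(field_name)


# Hypothetical login-form fragment; the token is the sample value from the text.
SAMPLE_FORM = """
<form method="post">
  <input type="hidden" name="csrfmiddlewaretoken" value="Vy00PE3Ra6aISwKBrPn72SFml00IcUV8">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
"""

token = extract_hidden_field(SAMPLE_FORM, "csrfmiddlewaretoken")
print(token)
```

On a real site the HTML would come from a GET request to the login page, and the token usually changes on every page load, so it has to be extracted fresh each time.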

We end up with a dictionary like this:

    payload = {
        "username": "USER NAME",
        "password": "PASSWORD",
        "csrfmiddlewaretoken": "CSRF_TOKEN"
    }

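Keep in mind that this is specific to this one site. The login form here is simple, but other sites may require us to inspect the browser's request log to find the keys and values that the login step actually sends.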
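The remaining step, sending the dictionary to the site, might look roughly like the sketch below. It assumes a session object that behaves like requests.Session (the tutorial lists requests among its packages). The function names are hypothetical, and sending a Referer header is an assumption based on how Django-style CSRF protection commonly behaves over HTTPS, not something the text above confirms for this site.

```python
def build_payload(username, password, csrf_token):
    """Assemble the login form fields described in step 1."""
    return {
        "username": username,
        "password": password,
        "csrfmiddlewaretoken": csrf_token,
    }


def log_in(session, login_url, payload):
    """POST the login form and return whatever the session returns.

    `session` is assumed to behave like requests.Session: it keeps cookies
    between calls and exposes post(url, data=..., headers=...).
    """
    # Assumption: the server checks the Referer header as part of CSRF checks.
    return session.post(login_url, data=payload, headers={"Referer": login_url})


# Possible usage with requests (not executed here; the https:// scheme is assumed):
#   import requests
#   session = requests.Session()
#   login_url = "https://bitbucket.org/account/signin"
#   page = session.get(login_url)
#   token = ...  # extract the csrfmiddlewaretoken hidden input from page.text
#   result = log_in(session, login_url, build_payload("USER NAME", "PASSWORD", token))
```

Using one session for both the GET and the POST matters because the CSRF token is normally tied to a cookie set when the login page is first loaded.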

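Python code for fetching a web page

In Python 3.x, the urllib.request module is used to fetch web pages. urllib.request.urlopen retrieves the page and returns it as a stream, read() reads the raw bytes from that stream, and decode() turns those bytes into text. The encoding can be found in the page source, in the tag <meta http-equiv="content-type" content="text/html;charset=gbk" /> (GBK in the example below). The result is the page's source code.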

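The following example fetches a page's source:

    import urllib.request

    # The URL was lost from the original post; set it to the page to fetch.
    url = '...'

    # Decode the downloaded bytes using the page's own encoding (GBK here).
    html = urllib.request.urlopen(url).read().decode('gbk')
    print(html)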
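Hard-coding 'gbk' only works for pages that really are GBK-encoded. As a rough sketch of the meta-tag lookup described above, the helper below searches the first bytes of a page for a charset declaration and decodes with it. The function name, the regular expression, and the 4096-byte search window are illustrative choices, and a regular expression is a shortcut rather than a full HTML parser.

```python
import re

# Matches charset=... inside a <meta ...> tag, with or without quotes.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def decode_html(raw, default="utf-8"):
    """Decode page bytes using the charset named in a <meta> tag, if any."""
    match = META_CHARSET.search(raw[:4096])
    encoding = match.group(1).decode("ascii") if match else default
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: fall back without raising.
        return raw.decode(default, errors="replace")


# A GBK-encoded sample page carrying the meta tag quoted in the text.
sample = (
    '<html><head><meta http-equiv="content-type" '
    'content="text/html;charset=gbk" /></head><body>你好</body></html>'
).encode("gbk")

print(decode_html(sample))
```

With a live page, the bytes returned by urlopen(...).read() would be passed in place of the sample.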

The documentation for urllib.request.urlopen follows:

urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

Open the URL url, which can be either a string or a Request object.

data must be a bytes object specifying additional data to be sent to the server, or None if no such data is needed. data may also be an iterable object and in that case Content-Length value must be specified in the headers. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided.

data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format. It should be encoded to bytes before being used as the data parameter. The charset parameter in Content-Type header may be used to specify the encoding. If charset parameter is not sent with the Content-Type header, the server following the HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1 encoding. It is advisable to use charset parameter with encoding used in Content-Type header with the Request.

urllib.request module uses HTTP/1.1 and includes Connection:close header in its HTTP requests.
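To make the data rules above concrete, here is a small sketch that URL-encodes a mapping, converts it to bytes, and attaches it to a Request together with an explicit charset. The helper name and the example.invalid URL are placeholders; nothing is sent over the network.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields, charset="utf-8"):
    """Return a Request whose body is `fields` in urlencoded form."""
    # urlencode() returns a str; urlopen() needs bytes, so encode it.
    body = urllib.parse.urlencode(fields, encoding=charset).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded; charset=" + charset
    }
    return urllib.request.Request(url, data=body, headers=headers)


request = build_post_request(
    "http://example.invalid/login",  # placeholder URL
    {"username": "USER NAME", "password": "PASSWORD"},
)

# Supplying data switches the method from GET to POST.
print(request.get_method())
print(request.data)
```

Passing the resulting object to urllib.request.urlopen(request) would perform the POST.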

The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS and FTP connections.

If context is specified, it must be a ssl.SSLContext instance describing the various SSL options. See HTTPSConnection for more details.

The optional cafile and capath parameters specify a set of trusted CA certificates for HTTPS requests. cafile should point to a single file containing a bundle of CA certificates, whereas capath should point to a directory of hashed certificate files. More information can be found in ssl.SSLContext.load_verify_locations().

The cadefault parameter is ignored.
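As an illustration of the context parameter, the sketch below builds an ssl.SSLContext with ssl.create_default_context(), which also accepts a CA bundle path. The helper name is hypothetical. Later versions of the Python documentation mark cafile and capath on urlopen as deprecated in favor of context, so building a context is likely the safer route on current interpreters.

```python
import ssl
import urllib.request


def make_context(ca_bundle_path=None):
    """Create a verifying TLS context, optionally with a custom CA bundle."""
    # With ca_bundle_path=None the system's default CA certificates are used.
    return ssl.create_default_context(cafile=ca_bundle_path)


context = make_context()
print(context.verify_mode == ssl.CERT_REQUIRED)

# Possible usage (not executed here; the URL is a placeholder):
#   response = urllib.request.urlopen("https://...", timeout=10, context=context)
```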

For http and https urls, this function returns a http.client.HTTPResponse object which has the following HTTPResponse Objects methods.

For ftp, file, and data urls and requests explicitly handled by legacy URLopener and FancyURLopener classes, this function returns a urllib.response.addinfourl object which can work as context manager and has methods such as

geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed

info(): return the meta-information of the page, such as headers, in the form of an email.message_from_string() instance (see Quick Reference to HTTP Headers)

getcode(): return the HTTP status code of the response.

Raises URLError on errors.
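A short sketch of how the timeout, the context-manager behavior, and URLError fit together is given below. The function name and the choice to return None on failure are illustrative. A data: URL is used so that the example runs without network access.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10.0, encoding="utf-8"):
    """Return the decoded body of `url`, or None if the request fails."""
    try:
        # The response object closes itself when the with-block exits.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode(encoding)
    except urllib.error.URLError as exc:
        # HTTPError is a subclass of URLError, so HTTP errors land here too.
        print("request failed:", exc.reason)
        return None


# A data: URL carries its own content, so no server is contacted.
print(fetch_text("data:text/plain;charset=utf-8,hello"))
```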

Note that None may be returned if no handler handles the request (though the default installed global OpenerDirector uses UnknownHandler to ensure this never happens).

In addition, if proxy settings are detected (for example, when a *_proxy environment variable like http_proxy is set), ProxyHandler is default installed and makes sure the requests are handled through the proxy.

The legacy urllib.urlopen function from Python 2.6 and earlier has been discontinued; urllib.request.urlopen() corresponds to the old urllib2.urlopen. Proxy handling, which was done by passing a dictionary parameter to urllib.urlopen, can be obtained by using ProxyHandler objects.
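For explicit proxy configuration, a ProxyHandler can be combined with build_opener, roughly as sketched here. The helper name and the 127.0.0.1:8080 proxy address are placeholders, and no request is actually sent.

```python
import urllib.request


def opener_with_proxy(proxy_url):
    """Build an opener that routes http and https requests through a proxy."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


opener = opener_with_proxy("http://127.0.0.1:8080")  # placeholder proxy address

# Possible usage (not executed here; the URL is a placeholder):
#   response = opener.open("http://...", timeout=10)
```

Calling urllib.request.install_opener(opener) would make plain urlopen() calls use the same proxy settings.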

Changed in version 3.2: cafile and capath were added.

Changed in version 3.2: HTTPS virtual hosts are now supported if possible (that is, if ssl.HAS_SNI is true).

New in version 3.2: data can be an iterable object.

Changed in version 3.3: cadefault was added.

Changed in version 3.4.3: context was added.

How to fetch a web page's source code with Python

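    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    import urllib3

    if __name__ == '__main__':
        # PoolManager manages and reuses connections across requests.
        http = urllib3.PoolManager()
        # 'IP' is the original post's placeholder; replace it with the target URL.
        r = http.request('GET', 'IP')
        # r.data holds the raw bytes; decode them with the page's encoding (GBK here).
        print(r.data.decode("gbk"))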

This fetches pages correctly. It requires urllib3 to be installed; the Python version used was 3.4.3.


Article link: http://szlqgy.com/24306.html
