吾爱破解 - 52pojie.cn

[Web Reverse Engineering] Getting Lanzou Cloud (蓝奏云) direct links, part 2 (password-protected shares, Python source included)

baipiao520, posted 2024-3-22 23:33
Last edited by baipiao520 on 2024-3-23 09:47

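Recap

Link to part one: Getting Lanzou Cloud direct links (Python source included)
Last time we analyzed a share link for a single file with no access password.
That post relied mainly on the re module (regular expressions) to pull parameters out of the page. Someone suggested using bs4 instead, but bs4 parses HTML and is of little help with values embedded in JavaScript, so this post sticks with regular expressions.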
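
To illustrate the point about parsers: an HTML parser can find the <script> element, but the JavaScript inside it comes back as one opaque string, so a regular expression is still needed to pull out a variable. Below is a minimal sketch using the standard library's html.parser (bs4 hands back script contents as plain text in much the same way). The sample page, the class name ScriptCollector, and the value abc123 are made up for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical page fragment, made up for illustration.
SAMPLE_HTML = """
<html><head><title>demo</title></head>
<body>
<script type="text/javascript">
    var skdklds = 'abc123';
</script>
</body></html>
"""


class ScriptCollector(HTMLParser):
    """Collect the raw text found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script bodies arrive here as plain text, not as parsed JavaScript.
        if self.in_script:
            self.chunks.append(data)


collector = ScriptCollector()
collector.feed(SAMPLE_HTML)
collector.close()
script_text = "".join(collector.chunks)

# The parser stops at the script boundary; the variable still needs a regex.
match = re.search(r"var\s+skdklds\s*=\s*'([^']*)';", script_text)
sign = match.group(1) if match else None
print(sign)
```

The parser step adds little here, which is why applying the regex directly to response.text is a reasonable shortcut for pages like this one.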

Prerequisites

A browser
A Python environment (the code below uses the requests library)

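Analysis

With last time's experience behind us, we go straight to a password-protected share link and look at it in the browser.
Before entering the password:
[screenshot]
After entering the password:
[screenshot]
And the network panel of the developer tools:
[screenshot]
This time the page does not make a chain of nested requests; a single request does the job.
So let's start with the request right away.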

import requests
url = "https://wwt.lanzouu.com/iW5jF1s99k6j"
password = 6666
headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
response = requests.get(url, headers=headers)
print(response.text)

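Looking at the page that comes back, it is very similar to last time's:
[screenshot]
[screenshot]
The only difference is that the ajax call now sits inside a down_p() function.
That even saves us several steps compared with last time.
So let's extract the parameters.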

import re  # needed for the patterns below
url_pattern = re.compile(r"url\s*:\s*'(/ajaxm\.php\?file=\d+)'")
url_match = url_pattern.search(response.text).group(1)
skdklds_pattern = re.compile(r"var\s+skdklds\s*=\s*'([^']*)';")
skdklds_match = skdklds_pattern.search(response.text).group(1)
print(url_match, skdklds_match)

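Since group(1) is the only thing we need from the Match object, this time I call group(1) right where the variable is defined, which also makes later use easier. The optimized code for both this post and the previous one is given at the end.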
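
One caveat with calling .group(1) directly: search() returns None when the pattern is not found (for example after the page layout changes), and the script then fails with an AttributeError that says little about what went wrong. The sketch below wraps the lookup in a small helper. The name extract and the sample page text are assumptions made up for illustration; the two patterns are the ones used above.

```python
import re

URL_PATTERN = re.compile(r"url\s*:\s*'(/ajaxm\.php\?file=\d+)'")
SIGN_PATTERN = re.compile(r"var\s+skdklds\s*=\s*'([^']*)';")


def extract(pattern, text, name):
    """Return group(1) of the first match, or raise a descriptive error."""
    match = pattern.search(text)
    if match is None:
        raise ValueError(f"could not find {name!r} in the page")
    return match.group(1)


# Hypothetical excerpt of a password page, made up for illustration.
sample_page = """
var skdklds = 'EXAMPLE_SIGN';
function down_p(){
    $.ajax({
        type : 'post',
        url : '/ajaxm.php?file=123456',
        data : 'action=downprocess&sign='+skdklds+'&p='+pwd,
    });
}
"""

url_match = extract(URL_PATTERN, sample_page, "ajax url")
skdklds_match = extract(SIGN_PATTERN, sample_page, "skdklds")
print(url_match, skdklds_match)

# A page that no longer contains the variable gives a readable error.
try:
    extract(SIGN_PATTERN, "<html>changed layout</html>", "skdklds")
except ValueError as exc:
    print("extraction failed:", exc)
```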
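Next we simulate the POST request. re_domain() is a small helper that extracts the host name from the share URL; its definition appears in the complete program below.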

data = {
    'action': 'downprocess',
    'sign': skdklds_match,
    'p': password,
}
headers = {
    "Referer": url,
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
}
response2 = requests.post(f"https://{re_domain(url)}{url_match}", headers=headers, data=data)
print(response2.text)

password can be either a str or an int, because it is converted to a string anyway when the form data is encoded; which one to use is a matter of taste.
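
The str-versus-int claim can be checked without touching the network. As far as I know, requests form-encodes a dict passed as data in the same way as the standard library's urlencode, which converts every value to text first. The sign value below is a placeholder, not a real one.

```python
from urllib.parse import urlencode

# Same form fields as above, once with an int password and once with a str.
form_with_int = {"action": "downprocess", "sign": "EXAMPLE_SIGN", "p": 6666}
form_with_str = {"action": "downprocess", "sign": "EXAMPLE_SIGN", "p": "6666"}

body_int = urlencode(form_with_int)
body_str = urlencode(form_with_str)

print(body_int)
print(body_int == body_str)
```

One practical difference remains: a password with leading zeros or letters cannot be written as an int at all, so str is the safer default.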
Having learned the lesson last time, don't forget to put Referer in the request headers.
The rest is exactly the same as before.

import json
data = json.loads(response2.text)
dom = data['dom']
url = data['url']
full_url = dom + "/file/" + url
headers = {
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"accept-language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
"sec-ch-ua": "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\", \"Microsoft Edge\";v=\"122\"",
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": "\"Windows\"",
"sec-fetch-dest": "document",
"sec-fetch-mode": "navigate",
"sec-fetch-site": "none",
"sec-fetch-user": "?1",
"upgrade-insecure-requests": "1",
"cookie": "down_ip=1"
}
response3 = requests.get(full_url, headers=headers, allow_redirects=False)
print(response3.headers['Location'])
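
The last two steps fail with a bare KeyError if the server answers with something unexpected (a layout change, presumably also a rejected password). A hedged sketch of the same logic with explicit checks follows; the helper names build_file_url and redirect_target and the sample values are assumptions, and only the dom and url fields and the Location header come from the script above.

```python
import json


def build_file_url(payload):
    """Join the 'dom' and 'url' fields the same way the script above does."""
    missing = [key for key in ("dom", "url") if not payload.get(key)]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return f"{payload['dom']}/file/{payload['url']}"


def redirect_target(status_code, headers):
    """Return the Location header of a redirect response, or None."""
    if 300 <= status_code < 400:
        return headers.get("Location")
    return None


# Hypothetical response body; real values come from response2.text.
sample_body = '{"dom": "https://example.invalid", "url": "?abc"}'
full_url = build_file_url(json.loads(sample_body))
print(full_url)

# Plain dicts stand in for response3.headers here.
print(redirect_target(302, {"Location": "https://example.invalid/real-file"}))
print(redirect_target(200, {}))
```

In the real script the second helper would be called as redirect_target(response3.status_code, response3.headers).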

Complete program (with password)

import requests
import re
import json
def re_domain(url):
    pattern_domain = r"https?://([^/]+)"
    match = re.search(pattern_domain, url)
    if match:
        domain = match.group(1)
        return domain
    else:
        return None
url = "https://wwt.lanzouu.com/iW5jF1s99k6j"
password = "6666"
headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
response = requests.get(url, headers=headers)
url_pattern = re.compile(r"url\s*:\s*'(/ajaxm\.php\?file=\d+)'")
url_match = url_pattern.search(response.text).group(1)
skdklds_pattern = re.compile(r"var\s+skdklds\s*=\s*'([^']*)';")
skdklds_match = skdklds_pattern.search(response.text).group(1)
print(url_match, skdklds_match)
data = {
    'action': 'downprocess',
    'sign': skdklds_match,
    'p': password,
}
headers = {
    "Referer": url,
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
}
response2 = requests.post(f"https://{re_domain(url)}{url_match}", headers=headers, data=data)
data = json.loads(response2.text)
dom = data['dom']
url = data['url']
full_url = dom + "/file/" + url
headers = {
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"accept-language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
"sec-ch-ua": "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\", \"Microsoft Edge\";v=\"122\"",
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": "\"Windows\"",
"sec-fetch-dest": "document",
"sec-fetch-mode": "navigate",
"sec-fetch-site": "none",
"sec-fetch-user": "?1",
"upgrade-insecure-requests": "1",
"cookie": "down_ip=1"
}
response3 = requests.get(full_url, headers=headers, allow_redirects=False)
print(response3.headers['Location'])
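
As a side note, re_domain() could also be written with the standard library's URL parser instead of a regular expression. This is an alternative sketch, not what the program above uses; the names host_of and origin_of are hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the host part of a URL, or None if there is none."""
    return urlsplit(url).netloc or None


def origin_of(url):
    """Return scheme plus host, e.g. 'https://example.com', or None."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}"


share_url = "https://wwt.lanzouu.com/iW5jF1s99k6j"
print(host_of(share_url))
print(origin_of(share_url))
print(host_of("not a url"))
```

host_of() corresponds to re_domain() as defined above (host only); origin_of() is convenient when the scheme should not be hard-coded in the f-string.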

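How to tell whether a password is required

The two pages actually differ quite a lot, and there are many ways to tell them apart:

  1. The password page has <title>文件</title> (文件 means "file"), while the page without a password has <title>filename - 蓝奏云</title>.
    In code:
    if "<title>文件</title>" in response.text:
        print("password required")
    else:
        print("no password")
  2. The password page contains many <style> blocks; the page without a password has none.
    In code:
    if "<style>" in response.text:
        print("password required")
    else:
        print("no password")
  3. Many of the later functions also exist only on the password page; I won't demonstrate them here.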
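
The first check can be wrapped in a function that tolerates extra whitespace inside the <title> tag. This is only a sketch: the name needs_password and the two page fragments are made up for illustration, modelled on the titles described above.

```python
import re

TITLE_PATTERN = re.compile(r"<title>\s*(.*?)\s*</title>", re.IGNORECASE | re.DOTALL)


def needs_password(html):
    """Guess whether a share page asks for a password, based on its <title>."""
    match = TITLE_PATTERN.search(html)
    if match is None:
        raise ValueError("page has no <title> element")
    # The password page uses the bare title "文件"; other pages show the file name.
    return match.group(1) == "文件"


# Hypothetical page fragments modelled on the two titles described above.
with_password = "<html><head><title>文件</title></head></html>"
without_password = "<html><head><title>demo.zip - 蓝奏云</title></head></html>"

print(needs_password(with_password))
print(needs_password(without_password))
```

Like the original string test, this depends entirely on the current page layout and will need adjusting if the site changes its titles.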

Complete program

import requests
import re
import json
def re_domain(url):
    pattern_domain = r"https?://([^/]+)"
    match = re.search(pattern_domain, url)
    if match:
        domain = match.group(1)
        return domain
    else:
        return None
def getwithp(url, password):
    domain = re_domain(url)  # derive the host here instead of relying on a global variable
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
    response = requests.get(url, headers=headers)
    url_pattern = re.compile(r"url\s*:\s*'(/ajaxm\.php\?file=\d+)'")
    url_match = url_pattern.search(response.text).group(1)
    skdklds_pattern = re.compile(r"var\s+skdklds\s*=\s*'([^']*)';")
    skdklds_match = skdklds_pattern.search(response.text).group(1)
    data = {
        'action': 'downprocess',
        'sign': skdklds_match,
        'p': password,
    }
    headers = {
        "Referer": url,
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
    response2 = requests.post(f"https://{domain}{url_match}", headers=headers, data=data)
    data = json.loads(response2.text)
    full_url = data['dom'] + "/file/" + data['url']
    headers = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
    "sec-ch-ua": "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\", \"Microsoft Edge\";v=\"122\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "cookie": "down_ip=1"
    }
    response3 = requests.get(full_url, headers=headers, allow_redirects=False)
    return response3.headers['Location']
def getwithoutp(url):
    domain = re_domain(url)  # derive the host here instead of relying on a global variable
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
    response = requests.get(url, headers=headers)
    iframe_pattern = re.compile(r'<iframe\s+class="ifr2"\s+name="\d+"\s+src="([^"]+)"\s+frameborder="0"\s+scrolling="no"></iframe>')
    matches = iframe_pattern.findall(response.text)
    response2 = requests.get(f"https://{domain}{matches[1]}", headers=headers)
    pattern = r"'sign'\s*:\s*'([^']+)'"
    sign = re.search(pattern, response2.text).group(1)
    pattern2 = r"url\s*:\s*'([^']+)'"
    url2 = re.search(pattern2, response2.text).group(1)
    data = {
        'action': 'downprocess',
        'signs': '?ctdf',
        'sign': sign,
        'websign': '',
        'websignkey': 'bL27',
        'ves': 1
    }
    headers = {
        "Referer": matches[1],
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
    response3 = requests.post(f"https://{domain}{url2}", headers=headers, data=data)
    data = json.loads(response3.text)
    full_url = data['dom'] + "/file/" + data['url']
    headers = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
    "sec-ch-ua": "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\", \"Microsoft Edge\";v=\"122\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "cookie": "down_ip=1"
    }
    response4 = requests.get(full_url, headers=headers, allow_redirects=False)
    return response4.headers['Location']
url = "https://wwt.lanzouu.com/iW5jF1s99k6j"
password = "6666"
domain = re_domain(url)
headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
response = requests.get(url, headers=headers)
if "<title>文件</title>" in response.text:  # title used by the password page
    print("password required")
    result = getwithp(url, password)
else:
    print("no password")
    result = getwithoutp(url)
print(result)

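Closing remarks

This tutorial is meant only as a reference for the approach; the site can change at any time, so none of this is guaranteed to keep working.
Multi-file (folder) shares will be covered in the next part.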


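wzvideni, posted 2024-3-24 09:33
Following your tutorial, I tried a password-protected folder share on Lanzou Cloud. I can already request the JSON data for the page shown after the password is entered, but the request for an individual file comes back empty and I don't know why. In the browser, once the folder password has been entered, clicking a file does not ask for a per-file password; maybe that is the reason, but adding a Referer header makes no difference.

If you have time, could you take a look?

Code: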
import json
import re

import requests


def re_domain(url):
    pattern_domain = r"https?://([^/]+)"
    match = re.search(pattern_domain, url)
    if match:
        domain = match.group()
        return domain
    else:
        return None


url = "https://wwur.lanzout.com/b01rs66mb"
password = "xfgc"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
}
response = requests.get(url, headers=headers)
url_match = re.search(r"url\s*:\s*'(/filemoreajax\.php\?file=\d+)'", response.text).group(1)
file_match = re.search(r"\d+", url_match).group()
t_match = re.search(r"var\s+ib\w+\s*=\s*'([^']*)';", response.text).group(1)
k_match = re.search(r"var\s+_h\w+\s*=\s*'([^']*)';", response.text).group(1)

print(url_match)
print(file_match)
print(t_match)
print(k_match)
# print(response.text)
data = {
    'lx': 2,
    'fid': file_match,
    'uid': '1674564',
    'pg': 1,
    'rep': '0',
    't': t_match,
    'k': k_match,
    'up': 1,
    'ls': 1,
    'pwd': password
}
headers = {
    "Referer": url,
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
}

print(f"{re_domain(url)}{url_match}")

response2 = requests.post(f"{re_domain(url)}{url_match}", headers=headers, data=data)
# print(response2.text)
data = json.loads(response2.text)
# print(data)
text_list = data['text']

headers = {
    "Referer": url,
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
    "sec-ch-ua": "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\", \"Microsoft Edge\";v=\"122\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "cookie": "down_ip=1"
}

for text in text_list:
    print(text['name_all'])
    file_url = f"{re_domain(url)}/{text['id']}"
    print(file_url)
    response3 = requests.get(file_url, headers=headers, allow_redirects=False)
    print(response3)
    print(response3.text)
    # print(response3.headers['Location'])

    break


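baipiao520 (original poster), posted 2024-3-24 10:36
Quoting wzvideni, 2024-3-24 09:33:
Following your tutorial, I tried a password-protected folder share on Lanzou Cloud. I can already request the JSON data for the page shown after the password is entered ...

The file_url you end up with is just the no-password share from my first post, so you can call my function directly: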
import json
import re
import requests

def re_domain(url):
    pattern_domain = r"https?://([^/]+)"
    match = re.search(pattern_domain, url)
    if match:
        domain = match.group()
        return domain
    else:
        return None

def getwithoutp(url):
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
    response = requests.get(url, headers=headers)
    iframe_pattern = re.compile(r'<iframe\s+class="ifr2"\s+name="\d+"\s+src="([^"]+)"\s+frameborder="0"\s+scrolling="no"></iframe>')
    matches = iframe_pattern.findall(response.text)
    response2 = requests.get(f"{domain}{matches[1]}", headers=headers)
    pattern = r"'sign'\s*:\s*'([^']+)'"
    sign = re.search(pattern, response2.text).group(1)
    pattern2 = r"url\s*:\s*'([^']+)'"
    url2 = re.search(pattern2, response2.text).group(1)
    data = {
        'action': 'downprocess',
        'signs': '?ctdf',
        'sign': sign,
        'websign': '2',
        'websignkey': 'xLG2',
        'ves': 1
    }
    headers = {
        "Referer": matches[1],
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    }
    response3 = requests.post(f"{domain}{url2}", headers=headers, data=data)
    data = json.loads(response3.text)
    full_url = str(data['dom']) + "/file/" + str(data['url'])
    headers = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6",
    "sec-ch-ua": "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\", \"Microsoft Edge\";v=\"122\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "cookie": "down_ip=1"
    }
    response4 = requests.get(full_url, headers=headers, allow_redirects=False)
    return response4.headers['Location']
url = "https://wwur.lanzout.com/b01rs66mb"
password = "xfgc"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
}
response = requests.get(url, headers=headers)
url_match = re.search(r"url\s*:\s*'(/filemoreajax\.php\?file=\d+)'", response.text).group(1)
file_match = re.search(r"\d+", url_match).group()
t_match = re.search(r"var\s+ib\w+\s*=\s*'([^']*)';", response.text).group(1)
k_match = re.search(r"var\s+_h\w+\s*=\s*'([^']*)';", response.text).group(1)
domain = re_domain(url)
print(url_match)
print(file_match)
print(t_match)
print(k_match)
# print(response.text)
data = {
    'lx': 2,
    'fid': file_match,
    'uid': '1674564',
    'pg': 1,
    'rep': '0',
    't': t_match,
    'k': k_match,
    'up': 1,
    'ls': 1,
    'pwd': password
}
print(f"{domain}{url_match}")
response2 = requests.post(f"{domain}{url_match}", headers=headers, data=data)
# print(response2.text)
data = json.loads(response2.text)
# print(data)
text_list = data['text']
for text in text_list:
    print(text['name_all'])
    print(text)
    file_url = f"{domain}/{text['id']}"
    print(file_url)
    print(getwithoutp(file_url))
    break
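
Both listings above stop after the first entry because of the break. To walk the whole folder, the loop body could be factored out roughly as follows. The function name folder_file_urls and the sample entries are hypothetical; only the id and name_all keys are taken from the code above.

```python
def folder_file_urls(origin, entries):
    """Yield (file name, share URL) pairs for every entry in the folder listing."""
    for entry in entries:
        yield entry["name_all"], f"{origin}/{entry['id']}"


# Hypothetical listing with the two keys the code above reads from data['text'].
sample_entries = [
    {"id": "iAAAA1", "name_all": "first.zip"},
    {"id": "iBBBB2", "name_all": "second.zip"},
]

pairs = list(folder_file_urls("https://wwur.lanzout.com", sample_entries))
for name, link in pairs:
    print(name, link)
```

Each share URL produced this way could then be passed to getwithoutp(), as in the reply above.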


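m96118, posted 2024-3-23 07:12
sai609, posted 2024-3-23 07:29
1. For Lanzou Cloud, the browser's built-in downloader is enough; downloads finish in seconds.
2. For 123, Tianyi, Baidu Netdisk, Quark, and Aliyun Drive: after extracting a direct link, is there any way to download without registering or logging in?
PS: I don't want to use my own account for fear of getting banned.
tsanye, posted 2024-3-23 07:30
Thanks 🙏 for sharing, learning from this.
jm1jm1, posted 2024-3-23 07:45

Very detailed explanation, thanks for sharing; still digesting it.
shallies, posted 2024-3-23 07:56
Learned something, thanks to the OP for sharing the technique.
saccsf, posted 2024-3-23 08:06
Notice: the author has been banned or deleted; content automatically hidden.
BBA119, posted 2024-3-23 08:06
Could you explain what this is for? The explanation is very detailed, but I can't follow it. Thanks for sharing.
WJayden, posted 2024-3-23 08:31
Learned something, quite detailed.
anchovy126, posted 2024-3-23 08:31
Thanks for sharing, worth studying.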