Python Scrapy pipelines: how do I sort items by the value of one of their fields? - V2EX

This topic was created 2789 days ago; the information in it may have since evolved or changed.

For example, the item has an infoid field, and item['infoid'] holds some data.

How can I sort items by the value of item['infoid'] inside a pipeline, so that the later pipelines process them in sorted order?

sorted(item.items(), key=lambda infoid:infoid[1])

After sorting this way, I always get: TypeError: string indices must be integers, not str
Is there another way to sort items by a given field value in a pipeline before they are written to the database?

5 replies • 2017-05-11 10:01:08 +08:00

1

knightdf

2017-05-10 19:17:06 +08:00

Pipelines should process items in no particular order; it is only the pipelines themselves that are ordered, by priority.
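
For reference, pipeline priority is set through Scrapy's ITEM_PIPELINES setting, where pipelines with lower integer values run first. A minimal sketch, assuming a hypothetical project called myproject with two hypothetical pipeline classes:

```python
# settings.py (sketch; the module path and class names are hypothetical)
ITEM_PIPELINES = {
    "myproject.pipelines.DatabasePipeline": 300,
    "myproject.pipelines.CleanPipeline": 100,
}

# Each item passes through the pipelines from the lowest value to the
# highest, regardless of the order in which the dict lists them.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)
print(run_order)
```

This controls the order of the pipelines each item passes through, not the order in which the items themselves arrive.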

2

dsg001

2017-05-10 19:38:05 +08:00

Try sorting with OrderedDict.
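
One possible reading of this suggestion, sketched with a plain dict standing in for a Scrapy item (the field names other than infoid are made up for illustration). Note that this orders the fields inside a single item; it does not order separate items relative to each other:

```python
from collections import OrderedDict

# A plain dict standing in for one scraped item (illustrative values).
item = {"title": "example", "infoid": 42, "author": "someone"}

# item.items() yields (field, value) pairs; sort them by field name
# and keep that order in an OrderedDict.
ordered_fields = OrderedDict(sorted(item.items(), key=lambda pair: pair[0]))
print(list(ordered_fields))
```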

3

zsz

2017-05-10 19:53:22 +08:00

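Pipelines process data in the order it arrives (as a stream). If you are only scraping a small amount of data, you can buffer it in a cache, then sort it and write it to the database at the end. Otherwise, just write it to the database directly and create an index on the infoid field.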
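
The buffering approach could look roughly like the sketch below. It relies on the open_spider, process_item and close_spider methods that Scrapy calls on a pipeline; the class name and the save method are hypothetical, and save just collects rows in a list where a real pipeline would do a database insert:

```python
class SortedBufferPipeline:
    """Buffers every item, then stores them sorted by infoid at spider close."""

    def open_spider(self, spider):
        self.buffer = []   # items in arrival order
        self.saved = []    # stand-in for the database table

    def process_item(self, item, spider):
        # Keep a copy for later; the item itself still moves on immediately.
        self.buffer.append(dict(item))
        return item

    def close_spider(self, spider):
        # All items have arrived, so sorting is now meaningful.
        for row in sorted(self.buffer, key=lambda it: it["infoid"]):
            self.save(row)

    def save(self, row):
        # Hypothetical placeholder for the real database insert.
        self.saved.append(row)
```

Later pipelines still receive items in arrival order, so under this sketch the sorted write has to happen in the pipeline that buffers. The buffer holds everything in memory, which is why it only suits small crawls.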

4

freestyle

2017-05-11 09:59:46 +08:00

sorted(item.items(), key=lambda i:i["infoid"])

5

freestyle

2017-05-11 10:01:08 +08:00

Reply #4 was wrong.
data = item.items()
sorted(data, key=lambda i:i["infoid"])
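
A key function like i["infoid"] works when data is a list of whole items. On a single item, item.items() yields (field, value) tuples, and indexing a tuple with "infoid" raises a TypeError. A minimal sketch of the list case, with plain dicts standing in for Scrapy items and made-up values:

```python
from operator import itemgetter

# Several items collected into a list (illustrative values).
data = [
    {"infoid": 30, "title": "c"},
    {"infoid": 10, "title": "a"},
    {"infoid": 20, "title": "b"},
]

# itemgetter("infoid") is equivalent to lambda i: i["infoid"].
ordered = sorted(data, key=itemgetter("infoid"))
print([i["infoid"] for i in ordered])
```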
