I haven't been able to find the cause. Could the experts here take a look?
The project uses Django + uwsgi + Nginx.
uwsgi configuration:
[uwsgi]
pythonpath=/usr/local/server
chdir=/home/server
env=DJANGO_SETTINGS_MODULE=conf.settings
module=server.wsgi
master=True
pidfile=logs/server.pid
vacuum=True
max-requests=1000
enable-threads=true
processes = 4
threads=8
listen=1024
daemonize=logs/wsgi.log
http=0.0.0.0:16020
buffer-size=32000
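There is a request that fetches an Excel file from the server; a full export is a bit over 400,000 rows.
The response takes too long, and the browser then gets a 504 from Nginx.
But the memory of one uwsgi child process keeps climbing: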
top -p 866
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
866 soe 20 0 7059m 5.3g 5740 S 100.8 33.9 4:17.34 uwsgi --ini /home/smb/work/soe_server
Core code:
from openpyxl.writer.excel import save_virtual_workbook
from django.http import HttpResponse
...
class ExcelTableObj(object):
    def __init__(self, file_name=None):
        if file_name:
            self.file_name = file_name
            self.wb = load_workbook(file_name)
        else:
            self.wb = Workbook()

    def create_new_sheet(self, title='Sheet1'):
        new_ws = self.wb.create_sheet(title=title)

    def write_to_sheet(self, sheetname, datas, filename):
        ws = self.wb[sheetname]
        for data in datas:
            ws.append(data)
        self.wb.save(filename)

    def update_sheet_name(self, sheetname):
        ws = self.wb.active
        ws.title = sheetname

    def append_data_to_sheet(self, sheetname, data):
        ws = self.wb[sheetname]
        ws.append(data)

    def save_file(self, file_name):
        self.wb.save(file_name)
        self.wb.close()

    def get_upload_file_data(self, name=None):
        if name:
            ws = self.wb.get_sheet_by_name(name)
        else:
            ws = self.wb.worksheets[0]
        rows = ws.max_row
        cols = ws.max_column
        file_data = []
        fields = []
        for i in range(1, cols+1):
            cell = ws.cell(row=1, column=i)
            if cell.value:
                fields.append(cell.value.lower().strip())
        for row in range(2, rows + 1):
            row_data = {}
            for j in range(len(fields)):
                value = ws.cell(row=row, column=j+1).value
                if value:
                    row_data[fields[j]] = str(value).strip()
            if row_data:
                file_data.append(row_data)
        return file_data

    def get_sheet_maxrow(self, name):
        ws = self.wb.get_sheet_by_name(name)
        rows = ws.max_row
        return rows

def _get_download_data(datas):
    for data in queryset:
        ...
        item = [str(data.account_id),
                ILLEGAL_CHARACTERS_RE.sub(r'', data.account_name) if data.account_name else data.account_name,
                type, fb_aac_conf.FB_ACCOUNT_STATUS[data.account_status],
                data.submitter, data.submit_time, data.confirmor, data.confirm_time,
                fb_aac_conf.BATCH_STATUS[data.status], data.reason, data.entity_name, data.payment_name,
                data.sale, data.ae_note, urgent
                ]
        yield item

queryset = MyModel.objects.filter(...)  # about 450k rows
datas = _get_download_data(queryset)
excel = ExcelTableObj()
excel.update_sheet_name(sheetname)
excel.append_data_to_sheet(sheetname, title)
excel.write_to_sheet(sheetname, datas, filename)
excel.save_file(filename)
response = HttpResponse(save_virtual_workbook(excel.wb),
                        content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename={}'.format(filename)
Could everyone help me analyze the cause of the problem? Thanks.
1
ch2 2021-03-19 17:49:01 +08:00
|
2
BeautifulSoap 2021-03-19 17:52:56 +08:00
Never mind exporting: with openpyxl, just reading a largish Excel file takes ten-odd seconds.
|
3
chenqh 2021-03-19 18:04:02 +08:00
Isn't Excel export usually done by generating a URL and then letting the front end download the file itself?
|
4
no1xsyzy 2021-03-19 18:17:51 +08:00
...The child process didn't exit after nginx returned the 504, did it?
Scaling up from the official benchmark, 400k rows would take roughly 4 minutes?
|
5
ryd994 2021-03-19 18:25:18 +08:00 via Android
Why export xls? If it's just data, wouldn't CSV do?
Excel can open it directly.
|
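A minimal sketch of the CSV route from reply 5, assuming the rows come from a generator such as the OP's `_get_download_data`. `Echo` and `iter_csv_lines` are hypothetical names introduced for illustration. Each row is formatted and yielded one at a time, so the full file is never held in memory; in Django such a generator could be handed to `StreamingHttpResponse` with `content_type='text/csv'`.

```python
import csv


class Echo:
    """Pseudo-buffer: write() hands the formatted line back instead of storing it."""

    def write(self, value):
        return value


def iter_csv_lines(header, rows):
    """Yield CSV-formatted lines one by one for the header and every row."""
    writer = csv.writer(Echo())
    # writerow() returns whatever the underlying write() returned,
    # which here is the formatted line itself.
    yield writer.writerow(header)
    for row in rows:
        yield writer.writerow(row)
```

Usage: `for line in iter_csv_lines(title, datas): ...` produces one line per row, with fields containing commas quoted automatically.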
6
vegetableChick OP
@ch2 The hang I can understand, but why does memory usage keep going up?
|
7
vegetableChick OP
@BeautifulSoap ...Why does it eat so much memory, and why does it keep climbing?
|
8
vegetableChick OP
@no1xsyzy Why does the process's memory keep climbing?
|
9
superrichman 2021-03-19 21:17:01 +08:00
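Those several hundred thousand rows get copied in memory who knows how many times, nothing is explicitly released after use, and openpyxl isn't in write_only mode either. It would be a miracle if memory didn't blow up. Debug it yourself step by step.
Also:
```
def _get_download_data(datas):
    for data in queryset:
        ...

queryset = MyModel.objects.filter(...)
datas = _get_download_data(queryset)
```
Are you sure this is how the argument is meant to be passed to the function?
|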
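For reference, a minimal sketch of the write_only mode mentioned in reply 9, assuming openpyxl is installed. `export_rows_write_only` and its parameters are hypothetical names, not part of the OP's code. In this mode rows are streamed to the file as they are appended instead of being kept as cell objects, and the workbook can only be saved once, so the repeated `save` / `save_virtual_workbook` calls in the original would have to be reduced to a single save.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def export_rows_write_only(rows, header, path, sheet_title="Sheet1"):
    """Stream rows into an .xlsx file using openpyxl's write-only mode."""
    wb = Workbook(write_only=True)
    # A write-only workbook starts with no sheets, so one must be created.
    ws = wb.create_sheet(title=sheet_title)
    ws.append(header)
    count = 0
    for row in rows:  # rows may be a generator; it is consumed lazily
        ws.append(row)
        count += 1
    # Save exactly once; a write-only workbook cannot be saved again.
    wb.save(path)
    return count
```

On the Django side, iterating the queryset with `queryset.iterator()` rather than directly would likely help as well, since it avoids caching every model instance on the queryset.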
10
BeautifulSoap 2021-03-19 21:47:58 +08:00
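I hadn't noticed this until the reply above pointed it out. As someone who has made the same mistake, a reminder to the OP: the plural of data is still data, not datas.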
|
11
ericls 2021-03-19 21:54:16 +08:00 via iPhone
@BeautifulSoap The singular of data is datum.
|
12
BeautifulSoap 2021-03-19 22:17:39 +08:00
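@ericls Language changes all the time. datum isn't actually used much, and using data as the singular too has become accepted usage throughout the English-speaking world.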
|
13
FindHao 2021-03-20 00:01:17 +08:00 via Android
I have some garbage code like this too. My final solution was the brute-force one: a crontab entry that restarts my own service.
|