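The quality of this model is really quite good, and it is now fairly convenient to try it online directly.
The model supports streaming output. I changed how it is used so that it is closer to Claude or stable-lm: you have to assemble the multi-turn conversation prompt yourself in a specific format. If the prompt does not follow the multi-turn format, the model simply runs as a plain completion.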
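For anyone unsure what "assembling the prompt yourself" looks like, here is a minimal sketch. It assumes the deployment expects the round-based layout used by the upstream ChatGLM2-6B tokenizer (`[Round N]`, then `问:` and `答:`); that layout is an assumption here, so check the Cog repo for the format it actually documents. The helper name `build_prompt` and the sample history are illustrative only.

```python
from typing import List, Tuple


def build_prompt(query: str, history: List[Tuple[str, str]]) -> str:
    """Join earlier (question, answer) turns and the new query into one prompt.

    ASSUMPTION: the "[Round N] / 问: / 答:" layout mirrors upstream
    ChatGLM2-6B; the hosted model may expect something different.
    """
    parts = []
    for i, (old_query, response) in enumerate(history, start=1):
        parts.append(f"[Round {i}]\n\n问:{old_query}\n\n答:{response}\n\n")
    # The final round ends right after "答:" so the model writes the answer.
    parts.append(f"[Round {len(history) + 1}]\n\n问:{query}\n\n答:")
    return "".join(parts)


# Illustrative history: one earlier exchange, then a new question.
history = [("Hello", "Hi, how can I help?")]
prompt = build_prompt("What is a cold start?", history)
print(prompt)
```

Sending a string without any `[Round N]` markers would then correspond to the plain completion mode described above.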
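On Replicate it runs on an A100 40G GPU, with the model at full FP32 precision.
Subjectively, I find the FP32 output quality better than the default FP16.
The Docker image is 30G. When a cold start is needed it takes 5 minutes, so the experience only gets comfortable once more people are using it.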
Run it here: https://replicate.com/nomagick/chatglm2-6b
Cog source: https://github.com/nomagick/ChatGLM2-6B-cog
Original model: https://github.com/THUDM/ChatGLM2-6B
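Note that I am just an open-source developer. I have no financial relationship whatsoever with the original model or with Replicate, and I earn nothing from the model running on Replicate. The original model is licensed for research use only.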
1
dvbs2000 2023-07-02 13:57:19 +08:00
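It shows this message:
"Model startup can sometimes take around 3 to 5 minutes. If you would like to learn more about why this happens, see the section on cold starts in our guide to how Replicate works." Does every user have to go through a cold start?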
2
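nomagick OP @dvbs2000 Once you have started it, the next person does not need a cold start. But if nobody calls it for a while it will scale to 0, and the next person after that needs a cold start again.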
3
dvbs2000 2023-07-02 14:18:10 +08:00
I tested it on a standard gaokao (college entrance exam) English cloze test: 40% correct. Bard got 50%, GPT-4 95-100%, and the other Chinese models I tried were mostly under 30%. So this already counts as decent.

Read the passage below and, from the four options A, B, C and D given for each question, choose the best one to fill in the blank. The questions are numbered 41-60, 20 in total.

I quietly placed my ear against the kitchen door. Mom had a male 41 ! I peeked(偷看) around. Sitting there was a gentleman, the most handsome man I’d 42 seen. Mom was a young widow then with three children. My sister was ten, my brother four and I six. I 43 having a daddy. And I knew he was the one. Then I marched right into the 44 . “Hi! I’m Patty. What’s your name?” “George.” Looking towards Mom, I asked, “Don’t you think my mom’s pretty?” “Patty!” Mom scolded with 45 . “Go and check on Benny.” George leaned forward and 46 , “Yes, I do. I’ll see you later, Patty. I think we will be good friends.” George started 47 Mom more often. He always seemed happy to see me and never grew 48 of my endless questions. Soon they entered into a 49 . For George who’d never been married before, coming back from World War II and into a ready-made family took some 50 . One evening was especially bad. Benny was crying on the kitchen floor. Annie was 51 loudly it wasn’t her place to 52 that spoiled child. And I spilled a whole pot of butter milk. With a(n) 53 look, George muttered(嘟囔), “I must have been 54 to marry a woman with three kids.” Mom fled to their bedroom in 55 , and George walked out. I hurried to the porch. “I’m sorry. I’ll be more careful next time. Please don’t 56 !” 57 wiping my tears, he said, “We’re friends, and friends never 58 the people they love. Don’t worry. I’ll always be here.” Then he went to 59 Mom. Over the years, George has always been there for me. I still turn to him with my 60 though he is 85.

41. A. volunteer B. visitor C. supporter D. scholar
42. A. ever B. always C. never D. seldom
43. A. recommended B. stopped C. missed D. minded
44. A. kitchen B. bathroom C. bedroom D. garden
45. A. excitement B. doubt C. embarrassment D. pride
46. A. yelled B. complained C. reported D. whispered
47. A. taking on B. calling on C. focusing on D. putting on
48. A. tired B. uncertain C. fond D. confident
49. A. conflict B. contact C. marriage D. competition
50. A. planning B. pretending C. adjusting D. misunderstanding
51. A. warning B. complaining C. wondering D. demanding
52. A. look after B. depend on C. stand for D. set up
53. A. exciting B. energetic C. curious D. vacant
54. A. talented B. mad C. brave D. unbelievable
55. A. shock B. vain C. tears D. ruins
56. A. leave B. refuse C. approach D. escape
57. A. Deeply B. Gently C. Properly D. Skillfully
58. A. betray B. force C. abandon D. threaten
59. A. persuade B. inform C. attract D. comfort
60. A. suggestions B. problems C. experiences D. achievements

Cloze test (20 questions; 1.5 points each; 30 points total)
41-45 BACAC 46-50 DBACC 51-55 BADBC 56-60 ABCDB
4
hackpro 2023-07-03 02:37:58 +08:00 via iPhone
How fast does inference run on an M2 Max?
6
pkoukk 2023-07-03 11:54:50 +08:00
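I tried a role-play prompt that I often use on 3.5, and the result was not great. It cannot even tell which role it is supposed to be playing, and it keeps speaking as me.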
8
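nomagick OP @pkoukk Your prompt may be too complex. In capability the model certainly cannot match the top-tier models; after all, it also uses far fewer resources. You could give it some examples and try few-shot.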
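A rough sketch of what few-shot could look like for a role-play prompt: state the role once, then include a couple of worked exchanges as earlier rounds so the model sees who speaks as whom. The `[Round N]` / `问:` / `答:` layout is assumed from upstream ChatGLM2-6B and may not match this deployment; the helper name, the captain/sailor role and the example lines are all made up for illustration.

```python
from typing import List, Tuple


def few_shot_prompt(instruction: str,
                    examples: List[Tuple[str, str]],
                    message: str) -> str:
    """Build a role-play prompt with worked examples as earlier rounds.

    ASSUMPTION: the round-based layout follows upstream ChatGLM2-6B.
    """
    user_turns = [user for user, _ in examples] + [message]
    # State the role once, in front of the first user turn only.
    user_turns[0] = f"{instruction}\n{user_turns[0]}"
    replies = [reply for _, reply in examples]

    out = []
    for i, user in enumerate(user_turns, start=1):
        out.append(f"[Round {i}]\n\n问:{user}\n\n答:")
        if i <= len(replies):
            # Example rounds carry the in-character reply we want imitated.
            out.append(f"{replies[i - 1]}\n\n")
    return "".join(out)


# Hypothetical role and examples, for illustration only.
examples = [
    ("Where are we headed?", "Captain: Due north, sailor."),
    ("Is a storm coming?", "Captain: Aye, reef the sails."),
]
prompt = few_shot_prompt(
    "You are the captain; I am a sailor. Reply only as the captain.",
    examples,
    "How long until we land?",
)
print(prompt)
```

Keeping every example reply in the same voice and prefix is the point: it gives a small model a pattern to continue rather than a long instruction to interpret.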
9
wangmou 2023-07-03 16:37:23 +08:00
The commercial license for 6B seems to run into the millions, so don't go using it commercially without thinking, folks.
11
OPLUS 2023-09-04 15:48:19 +08:00
Did OP do some fine-tuning? I set up a ChatGLM-6B myself (just streamlit run web_demo2.py), and with the same prompt the output from your deployment on Replicate is quite good, but the output from mine is poor.