{"componentChunkName":"component---src-templates-blog-detail-tsx","path":"/blog/2019-09-01-search-engine-blog","result":{"data":{"currentBlog":{"id":"97450b07-658b-5207-8216-1c7b9b51b115","frontmatter":{"thumbnail":"https://img.serverlesscloud.cn/2020114/1578988490344-v2-8b2cd2c5275aa2c5a3c5083a148a7a9f_1200x500.jpg","authors":["Anycodes"],"categories":["user-stories"],"date":"2019-09-01T00:00:00.000Z","title":"How to Help Search Engines 'See' Your Blog with Serverless and Natural Language Processing","description":"A small application that combines Serverless with natural language processing","authorslink":["https://www.zhihu.com/people/liuyu-43-97"],"translators":null,"translatorslink":null,"tags":["个人博客","serverless"],"keywords":"Serverless 自然语言处理","outdated":null},"wordCount":{"words":106,"sentences":34,"paragraphs":34},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-09-01-search-engine-blog.md","fields":{"slug":"/blog/2019-09-01-search-engine-blog/","keywords":["serverless","云函数","keywords","serverlesscloud","summary"]},"html":"<p>Natural language processing covers many topics; this article focuses on two of them: text summarization and keyword extraction.</p>\n<p>Many of us run our own blogs. Once published, some posts are picked up by search engines easily, while others are hard to find. Is there a way to make search engines friendlier to a blog? One good approach is to fill in the page's Description and Keywords.</p>\n<p>Filling these in by hand for every post is tedious, though. Can the process be automated? This article uses the Python libraries jieba and snownlp to implement text summarization and keyword extraction.</p>\n<h2 id=\"▎准备资源\"><a href=\"#%E2%96%8E%E5%87%86%E5%A4%87%E8%B5%84%E6%BA%90\" aria-label=\"▎准备资源 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>▎Prepare resources</h2>\n<p>Download the following resources:</p>\n<ul>\n<li><a href=\"https://github.com/fxsjy/jieba\">Python Chinese word segmentation component</a></li>\n<li><a 
href=\"https://github.com/isnowfy/snownlp\">Simplified Chinese Text Processing</a></li>\n</ul>\n<p>After downloading, create a new folder and copy the corresponding files into it:</p>\n<p><img src=\"https://img.serverlesscloud.cn/2020114/1578989071240-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Copy the corresponding files\"></p>\n<p>After copying, create the file index.py:</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\"># -*- coding: utf8 -*-\nimport json\nimport jieba.analyse\nfrom snownlp import SnowNLP\n\n\n# Extractive summary: SnowNLP returns up to summary_num key sentences.\ndef FromSnowNlp(text, summary_num):\n    s = SnowNLP(text)\n    return s.summary(summary_num)\n\n\n# Keyword extraction with jieba, using either TF-IDF or TextRank.\ndef FromJieba(text, keywords_type, keywords_num):\n    if keywords_type == &quot;tfidf&quot;:\n        return jieba.analyse.extract_tags(text, topK=keywords_num)\n    elif keywords_type == &quot;textrank&quot;:\n        return jieba.analyse.textrank(text, topK=keywords_num)\n    else:\n        # Unknown keywords_type: no keywords are returned.\n        return None\n\n\n# Cloud function entry point: read the parameters from the event\n# and return both the keywords and the summary.\ndef main_handler(event, context):\n    text = event[&quot;text&quot;]\n    summary_num = event[&quot;summary_num&quot;]\n    keywords_num = event[&quot;keywords_num&quot;]\n    keywords_type = event[&quot;keywords_type&quot;]\n\n    return {&quot;keywords&quot;: FromJieba(text, keywords_type, keywords_num),\n            &quot;summary&quot;: FromSnowNlp(text, summary_num)}</code></pre></div>\n<p>The code really is that simple!</p>\n<h2 id=\"▎上传文件\"><a href=\"#%E2%96%8E%E4%B8%8A%E4%BC%A0%E6%96%87%E4%BB%B6\" aria-label=\"▎上传文件 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>▎Upload the files</h2>\n<p>Create a new project in the Cloud Function (SCF) console:</p>\n<p><img 
src=\"https://img.serverlesscloud.cn/2020114/1578989070418-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Create a project\"></p>\n<p><img src=\"https://img.serverlesscloud.cn/2020114/1578989071153-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Create a project (2)\"></p>\n<p>For the submission method, choose zip upload:</p>\n<p>Then compress the files and rename the archive to index.zip:</p>\n<p><img src=\"https://img.serverlesscloud.cn/2020114/1578989070419-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Compress the files\"></p>\n<h2 id=\"▎测试\"><a href=\"#%E2%96%8E%E6%B5%8B%E8%AF%95\" aria-label=\"▎测试 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>▎Test</h2>\n<p>Before testing, you can adjust the function configuration as needed:</p>\n<p><img src=\"https://img.serverlesscloud.cn/2020114/1578989070789-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Configuration\"></p>\n<p>Then enter the input template:</p>\n<p><img src=\"https://img.serverlesscloud.cn/2020114/1578989070772-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Template input\"></p>\n<p>The template can look like this:</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">{\n  &quot;text&quot;: &quot;前来参观的人群络绎不绝。在“两弹历程馆”里……（略）”&quot;,\n  &quot;summary_num&quot;: 5,\n  &quot;keywords_num&quot;: 5,\n  &quot;keywords_type&quot;: &quot;tfidf&quot;\n}</code></pre></div>\n<p>Then click Test:</p>\n<p><img src=\"https://img.serverlesscloud.cn/2020114/1578989070876-v2-515f13a706f4f66f54ca3f72175be79a_hd.jpg\" alt=\"Test\"></p>\n<h2 id=\"▎应用\"><a href=\"#%E2%96%8E%E5%BA%94%E7%94%A8\" aria-label=\"▎应用 permalink\" class=\"anchor\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" 
version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>▎Application</h2>\n<p>At this point, we have implemented simple keyword extraction and extractive text summarization.</p>\n<p>Of course, this is only a starting point: there is also generative (abstractive) text summarization, and extractive summarization may call for different methods depending on the type of article. This simple demo implements just one small feature to help with basic SEO.</p>\n<p>When you build your own blog, you can add keywords and description fields. Each time you load article data from SQL, put these two values into the page's meta tags, which can improve the page's chances of being indexed.</p>\n<blockquote>\n<p><strong>Links:</strong></p>\n<ul>\n<li>GitHub: <a href=\"https://github.com/serverless/serverless/blob/master/README_CN.md\">github.com/serverless</a> </li>\n<li>Official website: <a href=\"https://serverless.com/\">serverless.com</a></li>\n</ul>\n</blockquote>\n<p>You are welcome to visit <a href=\"https://serverlesscloud.cn/\">Serverless 中文网</a>, where you can explore more Serverless application development under <a href=\"https://serverlesscloud.cn/best-practice\">Best Practices</a>.</p>","tableOfContents":"<ul>\n<li><a href=\"/blog/2019-09-01-search-engine-blog/#%E2%96%8E%E5%87%86%E5%A4%87%E8%B5%84%E6%BA%90\">▎Prepare resources</a></li>\n<li><a href=\"/blog/2019-09-01-search-engine-blog/#%E2%96%8E%E4%B8%8A%E4%BC%A0%E6%96%87%E4%BB%B6\">▎Upload the files</a></li>\n<li><a href=\"/blog/2019-09-01-search-engine-blog/#%E2%96%8E%E6%B5%8B%E8%AF%95\">▎Test</a></li>\n<li><a href=\"/blog/2019-09-01-search-engine-blog/#%E2%96%8E%E5%BA%94%E7%94%A8\">▎Application</a></li>\n</ul>"},"previousBlog":{"id":"ae4fd2f8-515c-5aec-b584-38427ef33f7e","frontmatter":{"thumbnail":"https://img.serverlesscloud.cn/2020114/1578989800047-part-00492-780.jpg","authors":["Anycodes"],"categories":["guides-and-tutorials","user-stories"],"date":"2019-09-16T00:00:00.000Z","title":"突破传统 OJ 瓶颈，「判题姬」接入云函数","description":"通过 Serverless 
实现在线编程","authorslink":["https://www.zhihu.com/people/liuyu-43-97"],"translators":null,"translatorslink":null,"tags":["在线编程","云函数"],"keywords":"Serverless 在线编程,Serverless OJ","outdated":null},"wordCount":{"words":169,"sentences":30,"paragraphs":30},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-09-16-online-Judge.md","fields":{"slug":"/blog/2019-09-16-online-Judge/","keywords":["python","serverless","云函数","代码","函数","serverless"]}},"nextBlog":{"id":"a4c5d988-73c4-5ee7-84d9-02d548d13970","frontmatter":{"thumbnail":"https://img.serverlesscloud.cn/2020413/1586784466875-1585645264854-08f2789b7010bbfb.jpg","authors":["serverless 社区"],"categories":["meetup"],"date":"2019-08-28T00:00:00.000Z","title":"云函数开发者工具实操 - 直播课","description":"本次直播，腾讯云高级产品经理张远哲将分享 Serverless 的开发者工具建设","authorslink":["https://serverlesscloud.cn"],"translators":null,"translatorslink":null,"tags":["课程","serverless"],"keywords":"Serverless 全局变量组件,Serverless 单独部署组件,Serverless Component","outdated":null},"wordCount":{"words":121,"sentences":31,"paragraphs":31},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-08-28-scf-aarona-meetup.md","fields":{"slug":"/blog/2019-08-28-scf-aarona-meetup/","keywords":["nodejs","serverless","无服务器","无服务器架构","云函数","Serverless","Framework","函数","部署","serverless","npm","架构","install"]}},"recommendBlogs":{"edges":[{"node":{"id":"4300b21c-7209-5256-86ff-0d38e3daec9b","frontmatter":{"thumbnail":"https://main.qcloudimg.com/raw/14f1c8eed372e76c1b139703b2f6d0fa.jpg","authors":["KieranMcCarthy"],"categories":["user-stories","engineering-culture"],"date":"2018-01-09T00:00:00.000Z","title":"我是如何在四年时间里，从厨师转行为 Serverless 应用开发者","description":"我是厨师出身，现在成为了一名 Serverless 应用开发者。","authorslink":["https://serverless.com/author/kieranmccarthy/"],"translators":["Aceyclee"],"translatorslink":["https://www.zhihu.com/people/Aceyclee"],"tags":["应用开发","Serverless"],"keywords":"Serverless 应用开发,Serverless 管理,厨师转行为 Serverless 
应用开发者","outdated":null},"wordCount":{"words":285,"sentences":38,"paragraphs":36},"fileAbsolutePath":"/opt/build/repo/content/blog/2018-01-09-from-chef-to-serverless-developer-in-4-years.md","fields":{"slug":"/blog/2018-01-09-from-chef-to-serverless-developer-in-4-years/","keywords":["无服务器","无服务器开发","云函数","学习","Serverless","构建","Framework","开发者","服务器","应用","学位","简历"]}}},{"node":{"id":"713a0563-4bf9-5721-bacb-3b4ef609fe4a","frontmatter":{"thumbnail":"https://s3-us-west-2.amazonaws.com/assets.blog.serverless.com/camp-fire/camp-fire-housing-thumb.jpg","authors":["EricWyne"],"categories":["guides-and-tutorials","user-stories"],"date":"2018-12-05T00:00:00.000Z","title":"Serverless Twitter 机器人帮助为坎普山火受灾者安置住房","description":"加利福尼亚州的坎普山火致使数千人流离失所，为此，我构建了一个简单的 Serverless Twitter 机器人来帮助将受灾者安置在临时住房！","authorslink":["https://serverless.com/author/ericwyne/"],"translators":["Aceyclee"],"translatorslink":["zhihu.com/people/Aceyclee"],"tags":null,"keywords":null,"outdated":null},"wordCount":{"words":157,"sentences":26,"paragraphs":26},"fileAbsolutePath":"/opt/build/repo/content/blog/2018-12-05-serverless-twitter-camp-fire.md","fields":{"slug":"/blog/2018-12-05-serverless-twitter-camp-fire/","keywords":["serverless","无服务器","云函数","Serverless","org","住房","Twitter","函数","受灾","机器人","山火"]}}},{"node":{"id":"98602143-b837-5f50-a24f-3b1ec76044d7","frontmatter":{"thumbnail":"https://s3-us-west-2.amazonaws.com/assets.blog.serverless.com/sqquid/sqquid-serverless-thumb.jpg","authors":["RonPeled"],"categories":["user-stories"],"date":"2018-12-17T00:00:00.000Z","title":"SQQUID：100% 无服务器初创公司","description":"SQQUID 将 AWS Lambda 
和无服务器框架用于其核心产品和营销网站。我们来看看一个完全无服务器的初创公司是怎样的。","authorslink":null,"translators":null,"translatorslink":null,"tags":null,"keywords":null,"outdated":null},"wordCount":{"words":266,"sentences":42,"paragraphs":42},"fileAbsolutePath":"/opt/build/repo/content/blog/2018-12-17-sqquid-one-hundred-percent-serverless.md","fields":{"slug":"/blog/2018-12-17-sqquid-one-hundred-percent-serverless/","keywords":["go","serverless","无服务器","无服务器架构","服务器","架构","Lambda","集成","FaaS","串行","系统"]}}},{"node":{"id":"29dc2e58-d2ba-56f9-aee1-d21b0bc62e0e","frontmatter":{"thumbnail":"https://s3-us-west-2.amazonaws.com/assets.blog.serverless.com/ao-com-story/ao-serverless-thumbnail.png","authors":["NickGottlieb"],"categories":["user-stories"],"date":"2019-04-24T00:00:00.000Z","title":"AO.com：逐渐转向无服务器优先","description":"AO.com 的 SCV 团队率先尝试无服务器服务。折服于无服务器框架的快速周转时间和低维护成本，整个团队逐渐转向无服务器优先。","authorslink":null,"translators":null,"translatorslink":null,"tags":null,"keywords":null,"outdated":null},"wordCount":{"words":236,"sentences":42,"paragraphs":35},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-04-24-ao-serverless-first.md","fields":{"slug":"/blog/2019-04-24-ao-serverless-first/","keywords":["serverless","无服务器","服务器","团队","Lambda","功能","构建"]}}},{"node":{"id":"752d08d1-387a-5bde-acf3-98141baab294","frontmatter":{"thumbnail":"https://img.serverlesscloud.cn/2020414/1586871710979-%E5%85%AC%E5%85%B1%E7%94%A8.png","authors":["Anycodes"],"categories":["user-stories"],"date":"2019-06-20T00:00:00.000Z","title":"如何用 Serverless 为 Python 
云函数打包依赖","description":"在使用无服务器云函数SCF时通常会遇到导入第三方库的问题，很多小伙伴比较头疼是：应该如何打包进去？这里，推荐几个不错的方法。","authorslink":["https://zhuanlan.zhihu.com/ServerlessGo"],"translators":null,"translatorslink":null,"tags":["云函数","Serverless"],"keywords":"Serverless,Serverless应用,无服务器云函数","outdated":null},"wordCount":{"words":81,"sentences":43,"paragraphs":43},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-06-20-for-python-cloud-functions.md","fields":{"slug":"/blog/2019-06-20-for-python-cloud-functions/","keywords":["java","serverless","无服务器","无服务器云函数","云函数","serverlesscloud","安装","serverless","pillowtest"]}}},{"node":{"id":"2dc78814-9d77-555b-a1bb-ad202c8ec2d1","frontmatter":{"thumbnail":"https://s3-us-west-2.amazonaws.com/assets.blog.serverless.com/cloudforecast/thumbnail.png","authors":["FrancoisLagier"],"categories":["user-stories"],"date":"2019-08-07T00:00:00.000Z","title":"Serverless：初创企业的理想选择？（CloudForecast 案例分析）","description":"CloudForecast 是 2018 年成立的一家独立初创企业，本文将介绍他们决定选择 Serverless 的原因。","authorslink":["https://serverless.com/author/francoislagier/"],"translators":["Aceyclee"],"translatorslink":["zhihu.com/people/Aceyclee"],"tags":null,"keywords":null,"outdated":null},"wordCount":{"words":211,"sentences":29,"paragraphs":29},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-08-07-serverless-for-startups.md","fields":{"slug":"/blog/2019-08-07-serverless-for-startups/","keywords":["serverless","云函数","serverless","函数","Serverless","utm","Framework","blog","CloudForecast","cloudforecast"]}}},{"node":{"id":"ae4fd2f8-515c-5aec-b584-38427ef33f7e","frontmatter":{"thumbnail":"https://img.serverlesscloud.cn/2020114/1578989800047-part-00492-780.jpg","authors":["Anycodes"],"categories":["guides-and-tutorials","user-stories"],"date":"2019-09-16T00:00:00.000Z","title":"突破传统 OJ 瓶颈，「判题姬」接入云函数","description":"通过 Serverless 实现在线编程","authorslink":["https://www.zhihu.com/people/liuyu-43-97"],"translators":null,"translatorslink":null,"tags":["在线编程","云函数"],"keywords":"Serverless 在线编程,Serverless 
OJ","outdated":null},"wordCount":{"words":169,"sentences":30,"paragraphs":30},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-09-16-online-Judge.md","fields":{"slug":"/blog/2019-09-16-online-Judge/","keywords":["python","serverless","云函数","代码","函数","serverless"]}}},{"node":{"id":"1a202c41-4b54-56ad-a5cf-55c0deabe542","frontmatter":{"thumbnail":"https://img.serverlesscloud.cn/20191227/1577413467740-v2-b65fcb6a94208a494005fc0c40a99eb6_1200x500.jpg","authors":["Aceyclee"],"categories":["news","user-stories"],"date":"2019-11-19T00:00:00.000Z","title":"荐书 | Serverless 架构：从原理、设计到项目实战","description":"安利一下 Serverless 中文技术社区成员 Anycodes 的大作","authorslink":["https://www.zhihu.com/people/Aceyclee"],"translators":null,"translatorslink":null,"tags":["Serverless"],"keywords":"Serverless 原理,Serverless 设计,Serverless 项目实战","outdated":null},"wordCount":{"words":119,"sentences":15,"paragraphs":15},"fileAbsolutePath":"/opt/build/repo/content/blog/2019-11-19-anycodes-book.md","fields":{"slug":"/blog/2019-11-19-anycodes-book/","keywords":["serverless","无服务器","Serverless","开发者","架构","云计算","click"]}}}],"totalCount":64}},"pageContext":{"isCreatedByStatefulCreatePages":false,"blogId":"97450b07-658b-5207-8216-1c7b9b51b115","previousBlogId":"ae4fd2f8-515c-5aec-b584-38427ef33f7e","nextBlogId":"a4c5d988-73c4-5ee7-84d9-02d548d13970","categories":["user-stories"]}}}