YP小站
Shine forth the light of ideals, star of the soul! Pour that stream of light into the twilight of the future.
2022-10-30T12:00:57.717Z
https://www.yp14.cn/
Peng Yang
389736180@qq.com
Hexo
What to do when etcd storage is full?
https://www.yp14.cn/2022/10/30/ETCD存储满了如何处理/
2022-10-30T11:59:40.000Z
2022-10-30T12:00:57.717Z
<h2 id="一、前言"><a href="#一、前言" class="headerlink" title="一、前言"></a>1. Introduction</h2><p>When the <code>etcd</code> log reports <code>etcdserver: mvcc: database space exceeded</code>, etcd has run out of storage quota (the default quota is 2 GB). Exceeding the quota raises an alarm, and the etcd cluster enters a maintenance mode in which operations are restricted.</p>
<p>You can check etcd storage usage with the following command:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ ETCDCTL_API=3 etcdctl --endpoints=<span class="string">"http://127.0.0.1:2379"</span> --write-out=table endpoint status</span><br></pre></td></tr></table></figure>
<p><img src="https://cdm.yp14.cn/img1/etcd-11.png" alt=""></p>
<a id="more"></a>
<h2 id="二、临时解决方案"><a href="#二、临时解决方案" class="headerlink" title="二、临时解决方案"></a>2. Temporary workaround</h2><blockquote>
<p>PS: take a snapshot backup before compacting: <code>etcdctl snapshot save backup.db</code></p>
</blockquote>
<p>You can buy time by <code>compacting</code> the etcd key-value store, as follows:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Get the current revision</span></span><br><span class="line">$ rev=$(ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 endpoint status --write-out=<span class="string">"json"</span> | egrep -o <span class="string">'"revision":[0-9]*'</span> | egrep -o <span class="string">'[0-9].*'</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Compact away all revisions older than the current one</span></span><br><span class="line">$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 compact <span class="variable">$rev</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Defragment to reclaim the freed space</span></span><br><span class="line">$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 defrag</span><br><span class="line"></span><br><span class="line"><span class="comment"># Disarm the alarm</span></span><br><span class="line">$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 alarm disarm</span><br><span class="line"></span><br><span class="line"><span class="comment"># Verify that writes succeed again</span></span><br><span class="line">$ ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 put testkey 123</span><br><span class="line"></span><br><span class="line">OK</span><br><span class="line"></span><br><span class="line"><span class="comment"># Check etcd storage usage again</span></span><br><span class="line">$ ETCDCTL_API=3 etcdctl --endpoints=<span class="string">"http://127.0.0.1:2379"</span> --write-out=table endpoint status</span><br></pre></td></tr></table></figure>
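<p>The trickiest step in the sequence above is scraping the revision out of the JSON status output. As a sketch, that step can be isolated into a small helper that is easy to test; the sample JSON below only imitates the field shape of <code>etcdctl endpoint status --write-out=json</code>, its values are made up:</p>

```shell
# Extract the current raw revision from the output of
# `etcdctl endpoint status --write-out=json`, without requiring jq.
revision_from_status_json() {
  printf '%s' "$1" | grep -o '"revision":[0-9]*' | head -n 1 | grep -o '[0-9]*$'
}

# Illustrative sample with the same field shape (values are made up):
sample='[{"Endpoint":"http://127.0.0.1:2379","Status":{"header":{"revision":123456},"dbSize":2147483648}}]'
rev=$(revision_from_status_json "$sample")
echo "$rev"   # prints 123456
```

<p><code>head -n 1</code> keeps only the first endpoint's revision when several endpoints are queried at once.</p>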
<h2 id="三、最终解决方案"><a href="#三、最终解决方案" class="headerlink" title="三、最终解决方案"></a>3. Permanent fix</h2><p>Add the following two flags to the etcd startup command:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Auto-compaction: retain one hour of history (compacts hourly)</span></span><br><span class="line">--auto-compaction-retention=1</span><br><span class="line"><span class="comment"># Raise the storage quota (in bytes); the official recommendation is at most 8 GB</span></span><br><span class="line">--quota-backend-bytes=8388608000</span><br></pre></td></tr></table></figure>
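<p>A quick sanity check on the unit conversion (my arithmetic, not from the post): an exact 8 GiB is 8589934592 bytes, so the value used above, 8388608000, actually sits slightly below that, at roughly 7.8 GiB.</p>

```shell
# --quota-backend-bytes is specified in bytes; compare the value used
# above with an exact 8 GiB.
quota_used=8388608000
quota_8gib=$((8 * 1024 * 1024 * 1024))
echo "$quota_8gib"                  # prints 8589934592
echo $((quota_used < quota_8gib))   # prints 1: the configured quota stays under 8 GiB
```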
<h2 id="四、最佳实践"><a href="#四、最佳实践" class="headerlink" title="四、最佳实践"></a>4. Best practice</h2><p>Have you used <code>Kuboard</code> (a Kubernetes multi-cluster management UI, official site: <a href="https://kuboard.cn" target="_blank" rel="external">https://kuboard.cn</a>)? If so, you may have hit this etcd out-of-space problem, because in the official Docker image the etcd startup arguments include neither <code>--auto-compaction-retention</code> nor <code>--quota-backend-bytes</code>.</p>
<p>Modify the <code>/entrypoint.sh</code> startup script from the official <code>Kuboard</code> Docker image:</p>
<p><img src="https://cdm.yp14.cn/img1/etcd-12.png" alt=""></p>
<p>Create the Dockerfile:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Edit the Dockerfile</span></span><br><span class="line">$ vim Dockerfile</span><br><span class="line"></span><br><span class="line">FROM eipwork/kuboard:v3.5.0.3</span><br><span class="line"></span><br><span class="line">COPY ./entrypoint.sh /entrypoint.sh</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build the image</span></span><br><span class="line">$ docker build -t eipwork/kuboard-modify:v3.5.0.3 . <span class="_">-f</span> Dockerfile</span><br></pre></td></tr></table></figure>
<p>Start Kuboard and inspect the process:</p>
<p><img src="https://cdm.yp14.cn/img1/etcd-13.png" alt=""></p>
<h2 id="五、参考文档"><a href="#五、参考文档" class="headerlink" title="五、参考文档"></a>5. References</h2><ul>
<li><a href="https://etcd.io/docs/v3.4/op-guide/maintenance/" target="_blank" rel="external">https://etcd.io/docs/v3.4/op-guide/maintenance/</a></li>
</ul>
How to alert on application logs?
https://www.yp14.cn/2022/10/23/业务日志告警如何做/
2022-10-23T12:43:15.000Z
2022-10-23T12:48:48.361Z
<h2 id="一、前言"><a href="#一、前言" class="headerlink" title="一、前言"></a>1. Introduction</h2><p>As Kubernetes adoption grows, centralized log collection, visualization, and alerting all need to be addressed. Kubernetes log collection is usually done in one of the following ways:</p>
<ul>
<li>1. Run the log collector as a <code>DaemonSet</code> on each Kubernetes node; application containers mount their log directories onto a well-known path on the node, and the collector reads those directories.</li>
<li>2. Run the log collector as a <code>DaemonSet</code> on each Kubernetes node and collect the containers&#39; <code>stdout</code> and <code>stderr</code> streams.</li>
<li>3. Run the log collector as a <code>Sidecar</code> container in the same Pod as the application, sharing the log directory via a volume so the collector container in the Pod can read it.</li>
</ul>
<blockquote>
<p>Once logs are collected into a central platform, the next question arises: how should we alert on application logs?</p>
</blockquote>
<p>Below is a Kubernetes log collection architecture diagram built from common open-source components.</p>
<p><img src="https://cdm.yp14.cn/img1/elk-10.png" alt=""></p>
<a id="more"></a>
<h2 id="二、日志格式"><a href="#二、日志格式" class="headerlink" title="二、日志格式"></a>2. Log formats</h2><p>Here are two example logs: an Nginx access log and a Java application log.</p>
<ul>
<li>1. Nginx access log format</li>
</ul>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> "@timestamp": "2022-10-20T11:47:05+08:00",</span><br><span class="line"> "servername": "www.example.com",</span><br><span class="line"> "remote_addr": "172.20.199.10",</span><br><span class="line"> "referer": "-",</span><br><span class="line"> "request_method": "GET",</span><br><span class="line"> "request_uri": "/",</span><br><span class="line"> "server_protocol": "HTTP/1.1",</span><br><span class="line"> "request_time": "0.000",</span><br><span class="line"> "status": 200,</span><br><span class="line"> "bytes": 577,</span><br><span class="line"> "useragent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36",</span><br><span class="line"> "x_forwarded": "172.18.25.11, 100.122.43.140",</span><br><span class="line"> "upstr_addr": "172.20.199.20:8080",</span><br><span class="line"> "upstr_host": "-",</span><br><span class="line"> "ups_resp_time": "0.01"</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<ul>
<li>2. Java application log format<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> "@timestamp": "2022-10-20T19:14:26.875+08:00",</span><br><span class="line"> "level": "INFO",</span><br><span class="line"> "appName": "test-service",</span><br><span class="line"> "requestId": "",</span><br><span class="line"> "remoteIp": "",</span><br><span class="line"> "traceId": "",</span><br><span class="line"> "spanId": "",</span><br><span class="line"> "parent": "",</span><br><span class="line"> "thread": "XNIO-1 task-14",</span><br><span class="line"> "class": "c.c.common.security.util.SecurityUtil",</span><br><span class="line"> "line": "118",</span><br><span class="line"> "message": "Not logged in",</span><br><span class="line"> "stack_trace": "java.lang.ClassCastException: null\n"</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
</li>
</ul>
<h2 id="三、告警要求"><a href="#三、告警要求" class="headerlink" title="三、告警要求"></a>3. Alert requirements</h2><ul>
<li>1. Nginx access logs: alert when more than 10 requests with HTTP status <code>404, 429, 499, or 5xx</code> occur within one minute</li>
<li>2. Java application logs: alert when more than 10 entries with log level <code>ERROR</code> occur within one minute</li>
<li>3. Deliver alerts through a <code>DingTalk bot</code> or a <code>Feishu (Lark) bot</code></li>
</ul>
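<p>Before wiring requirement 1 into an alerting system, it can be prototyped with plain shell against the JSON access-log format shown above. This sketch assumes, as log shippers normally emit, one JSON document per line; the status pattern mirrors the 404/429/499/5xx condition:</p>

```shell
# Count access-log lines whose HTTP status is 404, 429, 499 or 5xx.
count_alert_statuses() {
  grep -Ec '"status": (404|429|499|5[0-9][0-9]),'
}

# Illustrative input: three alert-worthy statuses out of four lines.
n=$(printf '%s\n' \
  '{"request_uri": "/",  "status": 200, "bytes": 577}' \
  '{"request_uri": "/a", "status": 404, "bytes": 0}' \
  '{"request_uri": "/b", "status": 502, "bytes": 0}' \
  '{"request_uri": "/c", "status": 499, "bytes": 0}' | count_alert_statuses)
echo "$n"   # prints 3
```

<p>Note that requirement 1 includes status 499, so an Elasticsearch <code>query_string</code> that only covers 5xx, 429, and 404 would need 499 added as well.</p>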
<h2 id="四、如何根据日志告警?"><a href="#四、如何根据日志告警?" class="headerlink" title="四、如何根据日志告警?"></a>4. How do we alert on logs?</h2><blockquote>
<p>In this article the logs are stored in <code>Elasticsearch</code></p>
</blockquote>
<p>This article uses <code>ElastAlert</code> to implement alerting. So what is ElastAlert?</p>
<p><code>ElastAlert</code> is a simple framework for alerting on anomalies, spikes, and other patterns in data stored in Elasticsearch.</p>
<p>It works by combining Elasticsearch with two types of components: rule types and alerts. Elasticsearch is queried periodically, and the results are passed to the rule type, which determines whether any matches are found. When a match occurs, one or more alerts fire, each taking action according to its type.</p>
<p>ElastAlert is configured by a set of rules, each of which defines a query, a rule type, and a set of alerts.</p>
<h3 id="ElastAlert-特性"><a href="#ElastAlert-特性" class="headerlink" title="ElastAlert 特性"></a>ElastAlert features</h3><ul>
<li>Simple architecture, flexible to customize</li>
<li>Many matching rule types (frequency, threshold, change detection, blacklist/whitelist, spike detection, and more)</li>
<li>Many alert types (email, HTTP POST, custom scripts, and more; <code>DingTalk and Feishu bots are not supported out of the box</code>)</li>
<li>Match aggregation, duplicate-alert suppression, and retry and expiry of failed alerts</li>
<li>Good availability: state is saved in an Elasticsearch index</li>
<li>Debugging and auditing of the alerting process</li>
</ul>
<h3 id="ElastAlert-可用性"><a href="#ElastAlert-可用性" class="headerlink" title="ElastAlert 可用性"></a>ElastAlert availability</h3><ul>
<li>ElastAlert saves its state to Elasticsearch and, on startup, resumes from where it previously stopped</li>
<li>If Elasticsearch is unresponsive, ElastAlert waits until it recovers before continuing</li>
<li>Alerts that throw errors may be retried automatically for a period of time</li>
</ul>
<h3 id="ElastAlert-部署"><a href="#ElastAlert-部署" class="headerlink" title="ElastAlert 部署"></a>ElastAlert deployment</h3><blockquote>
<p>Project: <a href="https://github.com/bitsensor/elastalert.git" target="_blank" rel="external">https://github.com/bitsensor/elastalert.git</a>. An official Docker image is provided, but it is not very convenient to use, so the author rebuilds a custom Docker image below.</p>
</blockquote>
<h3 id="ElastAlert-告警监控展示"><a href="#ElastAlert-告警监控展示" class="headerlink" title="ElastAlert 告警监控展示"></a>ElastAlert alert examples</h3><ul>
<li>1. DingTalk bot alert</li>
</ul>
<p><img src="https://cdm.yp14.cn/img1/elk-13.png" alt=""></p>
<ul>
<li>2. Feishu bot alert</li>
</ul>
<p><img src="https://cdm.yp14.cn/img1/elk-12.png" alt=""></p>
<h3 id="构建-elastalert-镜像"><a href="#构建-elastalert-镜像" class="headerlink" title="构建 elastalert 镜像"></a>构建 elastalert 镜像</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span 
class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 下载 ElastAlert 代码</span></span><br><span class="line">$ git <span class="built_in">clone</span> https://github.com/bitsensor/elastalert.git</span><br><span class="line">$ <span class="built_in">cd</span> elastalert</span><br><span class="line"></span><br><span class="line"><span class="comment"># 创建 Dockerfile,ElastAlert 默认不支持 钉钉机器人、飞书机器人,这里需要扩展下</span></span><br><span class="line">$ vim Dockerfile</span><br><span class="line"></span><br><span class="line">FROM python:3.6-alpine as pyea</span><br><span class="line"></span><br><span class="line">ENV ELASTALERT_VERSION=v0.2.4</span><br><span class="line">ENV ELASTALERT_URL=https://github.com/Yelp/elastalert/archive/<span class="variable">$ELASTALERT_VERSION</span>.zip</span><br><span class="line">ENV ELASTALERT_HOME /opt/elastalert</span><br><span class="line"></span><br><span class="line">WORKDIR /opt</span><br><span class="line"></span><br><span class="line">RUN sed -i <span class="string">'s/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g'</span> /etc/apk/repositories && \</span><br><span class="line"> apk add --update --no-cache ca-certificates openssl-dev openssl libffi-dev gcc musl-dev wget && \</span><br><span class="line"> apk add --update --no-cache curl tzdata make libmagic nodejs npm && \</span><br><span class="line"> apk add --update --no-cache tzdata && \</span><br><span class="line"> cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && \</span><br><span class="line"> 
<span class="built_in">echo</span> <span class="string">"Asia/Shanghai"</span> > /etc/timezone && \</span><br><span class="line"> wget -O elastalert.zip <span class="string">"<span class="variable">${ELASTALERT_URL}</span>"</span> && \</span><br><span class="line"> unzip elastalert.zip && \</span><br><span class="line"> rm elastalert.zip && \</span><br><span class="line"> mv e* <span class="string">"<span class="variable">${ELASTALERT_HOME}</span>"</span></span><br><span class="line"></span><br><span class="line">ENV TZ Asia/Shanghai</span><br><span class="line"></span><br><span class="line">WORKDIR <span class="string">"<span class="variable">${ELASTALERT_HOME}</span>"</span></span><br><span class="line"></span><br><span class="line">RUN mkdir ~/.pip && \</span><br><span class="line"> <span class="built_in">echo</span> <span class="string">'[global]'</span> >> ~/.pip/pip.conf && \</span><br><span class="line"> <span class="built_in">echo</span> <span class="string">'index-url = https://pypi.tuna.tsinghua.edu.cn/simple'</span> >> ~/.pip/pip.conf && \</span><br><span class="line"> /usr/<span class="built_in">local</span>/bin/python3 -m pip install --upgrade pip && \</span><br><span class="line"> pip3 install cryptography==3.3.2 && \</span><br><span class="line"> sed -i <span class="string">'s/jira>=1.0.10,<1.0.15/jira>=2.0.0/g'</span> setup.py && \</span><br><span class="line"> python3 setup.py install && \</span><br><span class="line"> pip3 install <span class="string">"setuptools==46.1.3"</span> && \</span><br><span class="line"> pip3 install pyOpenSSL==16.2.0 && \</span><br><span class="line"> sed -i <span class="string">'s/jira>=1.0.10,<1.0.15/jira>=2.0.0/g'</span> requirements.txt && \</span><br><span class="line"> pip3 install -r requirements.txt</span><br><span class="line"></span><br><span class="line">RUN <span class="built_in">cd</span> /opt && \</span><br><span class="line"> wget https://github.com/xuyaoqiang/elastalert-dingtalk-plugin/archive/master.zip 
&& \</span><br><span class="line"> unzip master.zip && \</span><br><span class="line"> rm <span class="_">-f</span> master.zip && \</span><br><span class="line"> <span class="built_in">cd</span> elastalert-dingtalk-plugin-master && \</span><br><span class="line"> cp -r elastalert_modules /opt/elastalert/ </span><br><span class="line"></span><br><span class="line">COPY . /opt/elastalert-server</span><br><span class="line"></span><br><span class="line">WORKDIR /opt/elastalert-server</span><br><span class="line"></span><br><span class="line">RUN sed -i <span class="string">'1i process.env.TZ = "Asia/Shanghai";'</span> index.js && \</span><br><span class="line"> npm --registry https://registry.npm.taobao.org install --production --quiet</span><br><span class="line"></span><br><span class="line">COPY config/elastalert.yaml /opt/elastalert/config.yaml</span><br><span class="line">COPY config/elastalert-test.yaml /opt/elastalert/config-test.yaml</span><br><span class="line">COPY config/config.json config/config.json</span><br><span class="line">COPY rule_templates/ /opt/elastalert/rule_templates</span><br><span class="line">COPY elastalert_modules/ /opt/elastalert/elastalert_modules</span><br><span class="line"></span><br><span class="line">RUN mkdir -p /opt/elastalert/rules/ /opt/elastalert/server_data/tests/ && \</span><br><span class="line"> <span class="built_in">cd</span> /opt/elastalert/elastalert_modules && \</span><br><span class="line"> wget https://raw.githubusercontent.com/gpYang/elastalert-feishu-plugin/main/elastalert_modules/feishu_alert.py</span><br><span class="line"></span><br><span class="line">EXPOSE 3030</span><br><span class="line">ENTRYPOINT [<span class="string">"npm"</span>, <span class="string">"start"</span>]</span><br><span class="line"></span><br><span class="line"><span class="comment"># 构建 ElastAlert 镜像</span></span><br><span class="line">$ docker build -t yangpeng2468/elastalert:v0.2.4 . 
<span class="_">-f</span> Dockerfile</span><br></pre></td></tr></table></figure>
<p>Push the image to a personal Docker Hub repository so others can pull it:<br><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ docker push yangpeng2468/elastalert:v0.2.4</span><br></pre></td></tr></table></figure></p>
<h3 id="配置-ElastAlert-并启动"><a href="#配置-ElastAlert-并启动" class="headerlink" title="配置 ElastAlert 并启动"></a>Configure and start ElastAlert</h3><ul>
<li>1. First, create the config.json configuration file</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Create the configuration directories</span></span><br><span class="line">$ mkdir -p /data/elastalert/config /data/elastalert/rules /data/elastalert/rule_templates</span><br><span class="line"></span><br><span class="line">$ vim config/config.json</span><br><span class="line"></span><br><span class="line">{</span><br><span class="line"> <span class="string">"appName"</span>: <span class="string">"elastalert-server"</span>,</span><br><span class="line"> <span class="string">"port"</span>: 3030,</span><br><span class="line"> <span class="string">"wsport"</span>: 3333,</span><br><span class="line"> <span class="string">"elastalertPath"</span>: <span class="string">"/opt/elastalert"</span>,</span><br><span class="line"> <span class="string">"verbose"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="string">"es_debug"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="string">"debug"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="string">"rulesPath"</span>: {</span><br><span class="line"> <span class="string">"relative"</span>: <span class="literal">true</span>,</span><br><span class="line"> <span class="string">"path"</span>: <span class="string">"/rules"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="string">"templatesPath"</span>: {</span><br><span class="line"> <span class="string">"relative"</span>: <span class="literal">true</span>,</span><br><span class="line"> <span class="string">"path"</span>: <span class="string">"/rule_templates"</span></span><br><span class="line"> },</span><br><span class="line"> <span class="string">"es_host"</span>: <span class="string">"xx.xx.xx.xx"</span>, <span class="comment"># ES address</span></span><br><span class="line"> <span class="string">"es_port"</span>: 9200,</span><br><span class="line"> <span class="string">"writeback_index"</span>: <span class="string">"elastalert_status"</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<ul>
<li>2. Create the elastalert.yaml configuration file</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line">$ vim config/elastalert.yaml</span><br><span class="line"></span><br><span class="line"><span class="comment"># The elasticsearch hostname for metadata writeback</span></span><br><span class="line"><span class="comment"># Note that every rule can have its own elasticsearch host</span></span><br><span class="line">es_host: xx.xx.xx.xx <span class="comment"># ES address</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># The elasticsearch port</span></span><br><span class="line">es_port: 9200</span><br><span class="line"></span><br><span class="line"><span class="comment"># Directory from which ElastAlert loads rule files</span></span><br><span class="line">rules_folder: rules</span><br><span class="line"></span><br><span class="line"><span class="comment"># How often ElastAlert queries Elasticsearch</span></span><br><span class="line">run_every:</span><br><span class="line"> minutes: 10</span><br><span class="line"></span><br><span class="line"><span class="comment"># Size of the query window on the time field; the default is 15 minutes</span></span><br><span class="line">buffer_time:</span><br><span class="line"> minutes: 10</span><br><span class="line"></span><br><span class="line"><span class="comment"># Option basic-auth username and password for elasticsearch</span></span><br><span class="line">es_username: elastic </span><br><span class="line">es_password: xxx <span class="comment"># ES password</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Index in which ElastAlert stores its own state</span></span><br><span class="line">writeback_index: elastalert_status</span><br><span class="line"></span><br><span class="line"><span class="comment"># Alias</span></span><br><span class="line">writeback_<span class="built_in">alias</span>: elastalert_alerts</span><br><span class="line"></span><br><span class="line"><span class="comment"># Retry window for failed alerts</span></span><br><span class="line">alert_time_<span class="built_in">limit</span>:</span><br><span class="line"> days: 2</span><br></pre></td></tr></table></figure>
<ul>
<li>3. Create the Nginx access-log alert rule file</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br></pre></td><td class="code"><pre><span class="line">$ vim rules/nginx.yaml</span><br><span class="line"></span><br><span class="line"><span class="comment">#rule name 必须是独一的,不然会报错,这个定义完成之后,会成为报警的标题</span></span><br><span class="line">name: nginx-access-alert</span><br><span class="line"></span><br><span class="line"><span class="comment">#配置的是frequency,需要两个条件满足,在相同 query_key条件下,timeframe 
范围内有num_events个被过滤出来的异常</span></span><br><span class="line"><span class="built_in">type</span>: frequency</span><br><span class="line"></span><br><span class="line"><span class="comment">#指定index,支持正则匹配同时如果嫌麻烦直接* 也可</span></span><br><span class="line">index: nginx-*-prod-%Y-%m-%d </span><br><span class="line">use_strftime_index: <span class="literal">true</span></span><br><span class="line"></span><br><span class="line"><span class="comment">#时间触发的次数</span></span><br><span class="line">num_events: 10</span><br><span class="line"></span><br><span class="line"><span class="comment">#和num_events参数关联,也就是说1分钟内出现10次会报警</span></span><br><span class="line">timeframe:</span><br><span class="line"> minutes: 1</span><br><span class="line"></span><br><span class="line"><span class="comment">#同一规则的两次警报之间的最短时间。在此时间内发生的任何警报都将被丢弃。默认值为一分钟。</span></span><br><span class="line">realert:</span><br><span class="line"> minutes: 3</span><br><span class="line"></span><br><span class="line"><span class="comment">#防止同一条规则在一段时间内发出两次警报</span></span><br><span class="line"><span class="comment">#realert:</span></span><br><span class="line"><span class="comment"># days: 1</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># query_key 用来防止基于某个字段的重复项</span></span><br><span class="line">realert:</span><br><span class="line"> minutes: 10</span><br><span class="line">query_key: servername</span><br><span class="line"></span><br><span class="line"><span class="comment">#用来拼配告警规则,elasticsearch 的query语句,支持 AND&OR等</span></span><br><span class="line">filter:</span><br><span class="line">- query:</span><br><span class="line"> query_string: </span><br><span class="line"> query: <span class="string">"status: [500 TO 599] OR status: 429 OR status: 404"</span></span><br><span class="line"></span><br><span class="line"><span class="comment">#只需要的字段 https://elastalert.readthedocs.io/en/latest/ruletypes.html#include</span></span><br><span class="line">include: [<span 
class="string">"servername"</span>, <span class="string">"request_method"</span>, <span class="string">"request_uri"</span>, <span class="string">"remote_addr"</span>, <span class="string">"@timestamp"</span>, <span class="string">"status"</span>, <span class="string">"request_time"</span>, <span class="string">"ups_resp_time"</span>, <span class="string">"x_forwarded"</span>, <span class="string">"upstr_addr"</span>]</span><br><span class="line"></span><br><span class="line"><span class="comment">#告警方式,钉钉 和 飞书 告警,可以只选择一种就行</span></span><br><span class="line">alert:</span><br><span class="line">- <span class="string">"elastalert_modules.dingtalk_alert.DingTalkAlerter"</span></span><br><span class="line">- <span class="string">"elastalert_modules.feishu_alert.FeishuAlert"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 钉钉机器人接口地址</span></span><br><span class="line">dingtalk_webhook: <span class="string">"https://oapi.dingtalk.com/robot/send?access_token=xxx"</span></span><br><span class="line">dingtalk_msgtype: <span class="string">"text"</span></span><br><span class="line">alert_subject: <span class="string">"Nginx访问日志异常"</span></span><br><span class="line">alert_text_<span class="built_in">type</span>: alert_text_only</span><br><span class="line">alert_text: |</span><br><span class="line"> 【告警主题】 Nginx访问日志异常</span><br><span class="line"> 【告警条件】 异常访问日志1分钟内大于10次</span><br><span class="line"> 【告警时间(UTC)】 {}</span><br><span class="line"> 【告警域名】 {}</span><br><span class="line"> 【状态码】 {}</span><br><span class="line"> 【请求URL】 {}</span><br><span class="line"> 【请求协议】 {}</span><br><span class="line"> 【客户端IP】 {}</span><br><span class="line"> 【响应时间】 {}</span><br><span class="line"> 【后端响应时间】 {}</span><br><span class="line"> 【后端请求主机】 {}</span><br><span class="line"> 【异常状态码数量】 {}</span><br><span class="line">alert_text_args:</span><br><span class="line"> - <span class="string">"@timestamp"</span></span><br><span class="line"> - 
servername</span><br><span class="line"> - status</span><br><span class="line"> - request_uri</span><br><span class="line"> - request_method</span><br><span class="line"> - x_forwarded</span><br><span class="line"> - request_time</span><br><span class="line"> - ups_resp_time</span><br><span class="line"> - upstr_addr</span><br><span class="line"> - num_hits</span><br><span class="line"></span><br><span class="line"><span class="comment"># 飞书机器人接口地址</span></span><br><span class="line">feishualert_url: <span class="string">"https://open.feishu.cn/open-apis/bot/v2/hook/"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 飞书机器人id</span></span><br><span class="line">feishualert_botid:</span><br><span class="line"> <span class="string">"xxx"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 告警标题</span></span><br><span class="line">feishualert_title:</span><br><span class="line"> <span class="string">"Nginx访问日志异常"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 这个时间段内的匹配将不告警,适用于某些时间段请求低谷避免误报警</span></span><br><span class="line">feishualert_skip:</span><br><span class="line"> start: <span class="string">"00:00:00"</span></span><br><span class="line"> end: <span class="string">"00:01:00"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 告警内容</span></span><br><span class="line"><span class="comment"># 使用{}可匹配matches</span></span><br><span class="line">feishualert_body:</span><br><span class="line"> <span class="string">"</span><br><span class="line"> 【告警主题】: {feishualert_title}\n</span><br><span class="line"> 【告警条件】: 异常访问日志1分钟内大于10次\n</span><br><span class="line"> 【告警时间】: {feishualert_time}\n</span><br><span class="line"> 【告警域名】: {servername}\n</span><br><span class="line"> 【状态码】: {status}\n</span><br><span class="line"> 【请求URL】: {request_uri}\n</span><br><span class="line"> 【请求协议】: {request_method}\n</span><br><span 
class="line"> 【客户端IP】: {x_forwarded}\n</span><br><span class="line"> 【响应时间】: {request_time}\n</span><br><span class="line"> 【后端响应时间】: {ups_resp_time}\n</span><br><span class="line"> 【后端请求主机】: {upstr_addr}\n</span><br><span class="line"> 【异常状态码数量】: {num_hits}</span><br><span class="line"> "</span></span><br></pre></td></tr></table></figure>
<ul>
<li>4、创建 java 业务日志报警文件</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br></pre></td><td class="code"><pre><span class="line">$ vim rules/java.yaml</span><br><span class="line"></span><br><span class="line"><span class="comment">#rule name 必须是独一的,不然会报错,这个定义完成之后,会成为报警的标题</span></span><br><span class="line">name: java-prod-alert</span><br><span class="line"></span><br><span class="line"><span class="comment">#配置的是frequency,需要两个条件满足,在相同 query_key条件下,timeframe 范围内有num_events个被过滤出来的异常</span></span><br><span class="line"><span class="built_in">type</span>: frequency</span><br><span class="line"></span><br><span class="line"><span class="comment">#指定index,支持正则匹配同时如果嫌麻烦直接* 也可</span></span><br><span class="line">index: java-*-prod-%Y-%m-%d</span><br><span class="line">use_strftime_index: <span class="literal">true</span></span><br><span class="line"></span><br><span class="line"><span class="comment">#时间触发的次数</span></span><br><span 
class="line">num_events: 10</span><br><span class="line"></span><br><span class="line"><span class="comment">#和num_events参数关联,也就是说1分钟内出现10次会报警</span></span><br><span class="line">timeframe:</span><br><span class="line"> minutes: 1</span><br><span class="line"></span><br><span class="line"><span class="comment">#同一规则的两次警报之间的最短时间。在此时间内发生的任何警报都将被丢弃。默认值为一分钟。</span></span><br><span class="line">realert:</span><br><span class="line"> minutes: 3</span><br><span class="line"></span><br><span class="line"><span class="comment">#防止同一条规则在一段时间内发出两次警报</span></span><br><span class="line"><span class="comment">#realert:</span></span><br><span class="line"><span class="comment"># days: 1</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># query_key 用来防止基于某个字段的重复项</span></span><br><span class="line">realert:</span><br><span class="line"> minutes: 15</span><br><span class="line">query_key: applicationName</span><br><span class="line"></span><br><span class="line"><span class="comment">#用来匹配告警规则,elasticsearch 的query语句,支持 AND&OR等</span></span><br><span class="line">filter:</span><br><span class="line">- query:</span><br><span class="line"> query_string: </span><br><span class="line"> query: <span class="string">"level: ERROR"</span></span><br><span class="line"></span><br><span class="line"><span class="comment">#只需要的字段 https://elastalert.readthedocs.io/en/latest/ruletypes.html#include</span></span><br><span class="line">include: [<span class="string">"applicationName"</span>, <span class="string">"level"</span>, <span class="string">"@timestamp"</span>, <span class="string">"_index"</span>]</span><br><span class="line"></span><br><span class="line"><span class="comment">#告警方式,钉钉 和 飞书 告警,可以只选择一种就行</span></span><br><span class="line">alert:</span><br><span class="line">- <span class="string">"elastalert_modules.dingtalk_alert.DingTalkAlerter"</span></span><br><span class="line">- <span
class="string">"elastalert_modules.feishu_alert.FeishuAlert"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 钉钉机器人接口地址</span></span><br><span class="line">dingtalk_webhook: <span class="string">"https://oapi.dingtalk.com/robot/send?access_token=xxx"</span></span><br><span class="line">dingtalk_msgtype: <span class="string">"text"</span></span><br><span class="line">alert_subject: <span class="string">"java业务日志异常"</span></span><br><span class="line">alert_text_<span class="built_in">type</span>: alert_text_only</span><br><span class="line">alert_text: |</span><br><span class="line"> 【告警主题】 java业务日志异常</span><br><span class="line"> 【告警条件】 异常业务日志1分钟内大于10次</span><br><span class="line"> 【告警时间(UTC)】 {}</span><br><span class="line"> 【告警业务名称】 {}</span><br><span class="line"> 【告警业务索引】 {}</span><br><span class="line"> 【告警日志级别】 {}</span><br><span class="line"> 【错误日志数量】 {}</span><br><span class="line">alert_text_args:</span><br><span class="line"> - <span class="string">"@timestamp"</span></span><br><span class="line"> - applicationName</span><br><span class="line"> - _index</span><br><span class="line"> - level</span><br><span class="line"> - num_hits</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># 飞书机器人接口地址</span></span><br><span class="line">feishualert_url: <span class="string">"https://open.feishu.cn/open-apis/bot/v2/hook/"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 飞书机器人id</span></span><br><span class="line">feishualert_botid:</span><br><span class="line"> <span class="string">"xxx"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 告警标题</span></span><br><span class="line">feishualert_title:</span><br><span class="line"> <span class="string">"toB业务日志异常"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 
这个时间段内的匹配将不告警,适用于某些时间段请求低谷避免误报警</span></span><br><span class="line">feishualert_skip:</span><br><span class="line"> start: <span class="string">"00:00:00"</span></span><br><span class="line"> end: <span class="string">"00:01:00"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 告警内容</span></span><br><span class="line"><span class="comment"># 使用{}可匹配matches</span></span><br><span class="line">feishualert_body:</span><br><span class="line"> <span class="string">"</span><br><span class="line"> 【告警主题】: {feishualert_title}\n</span><br><span class="line"> 【告警条件】: 异常业务日志1分钟内大于10次\n</span><br><span class="line"> 【告警时间】: {feishualert_time}\n</span><br><span class="line"> 【告警业务名称】: {applicationName}\n</span><br><span class="line"> 【告警业务索引】: {_index}\n</span><br><span class="line"> 【告警日志级别】: {level}\n</span><br><span class="line"> 【错误日志数量】: {num_hits}</span><br><span class="line"> "</span></span><br></pre></td></tr></table></figure>
<ul>
<li>5、创建 ElastAlert 镜像启动脚本</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">cd</span> /data/elastalert</span><br><span class="line">$ vim run.sh</span><br><span class="line"></span><br><span class="line">docker run <span class="_">-d</span> --restart always -p 3030:3030 -p 3333:3333 \</span><br><span class="line"> --name elastalert \</span><br><span class="line"> --hostname elastalert \</span><br><span class="line"> -v /data/elastalert/config/elastalert.yaml:/opt/elastalert/config.yaml \</span><br><span class="line"> -v /data/elastalert/config/elastalert-test.yaml:/opt/elastalert/config-test.yaml \</span><br><span class="line"> -v /data/elastalert/config/config.json:/opt/elastalert-server/config/config.json \</span><br><span class="line"> -v /data/elastalert/rules:/opt/elastalert/rules \</span><br><span class="line"> -v /data/elastalert/rule_templates:/opt/elastalert/rule_templates \</span><br><span class="line"> yangpeng2468/elastalert:v0.2.4</span><br><span class="line"></span><br><span class="line"><span class="comment"># 启动</span></span><br><span class="line">$ sh run.sh</span><br></pre></td></tr></table></figure>
<h2 id="五、参考文档"><a href="#五、参考文档" class="headerlink" title="五、参考文档"></a>五、参考文档</h2><ul>
<li><a href="https://github.com/bitsensor/elastalert" target="_blank" rel="external">https://github.com/bitsensor/elastalert</a></li>
<li><a href="https://zhuanlan.zhihu.com/p/386722918" target="_blank" rel="external">https://zhuanlan.zhihu.com/p/386722918</a></li>
</ul>
<h2 id="一、前言"><a href="#一、前言" class="headerlink" title="一、前言"></a>一、前言</h2><p>随着 Kubernetes 使用越来越广泛,日志集中收集、展示、告警等都需要考虑的事情。Kubernetes 日志收集方案一般有下面几种:</p>
<ul>
<li>1、日志收集组件以 <code>Daemonset</code> 形式运行在 Kubernetes Node 中,业务容器日志目录统一挂载到Node节点指定的目录,日志收集组件读取对应的目录。</li>
<li>2、日志收集组件以 <code>Daemonset</code> 形式运行在 Kubernetes Node 中,收集业务容器标准输出<code>stdout</code>和<code>stderr</code>日志。</li>
<li>3、日志收集组件以 <code>Sidecar</code> 形式和业务容器运行在一个pod中,把业务日志目录挂载出来,让同一个Pod中日志收集容器能读取到。</li>
</ul>
<blockquote>
<p>日志收集到集中日志平台后,另一个问题随之而来:应该如何对业务日志进行告警?</p>
</blockquote>
<p>下面是一个比较通用的开源 Kubernetes 日志收集方案架构图。</p>
<p><img src="https://cdm.yp14.cn/img1/elk-10.png" alt=""></p>
详解Nginx proxy_pass 使用
https://www.yp14.cn/2022/03/27/详解Nginx-proxy-pass-使用/
2022-03-27T11:30:17.000Z
2022-03-27T11:31:04.124Z
<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>日常不管是研发还是运维,都多少会使用<code>Nginx</code>服务。很多情况下 Nginx 用于反向代理,这就离不开<code>proxy_pass</code>。有些同学对 <code>proxy_pass</code> 转发代理时<code>后面url加 /</code>、<code>后面url没有 /</code>、<code>后面url添加其它路由</code>等场景不太明白其中的区别,下面来聊聊这些分别代表什么意思。</p>
<a id="more"></a>
<h2 id="详解"><a href="#详解" class="headerlink" title="详解"></a>详解</h2><p>客户端请求 URL <code>https://172.16.1.1/hello/world.html</code></p>
<h3 id="第一种场景-后面url加"><a href="#第一种场景-后面url加" class="headerlink" title="第一种场景 后面url加 /"></a>第一种场景 后面url加 /</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">location /hello/ {</span><br><span class="line"> proxy_pass http://127.0.0.1/;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p><code>结果</code>:代理到URL:<a href="http://127.0.0.1/world.html" target="_blank" rel="external">http://127.0.0.1/world.html</a></p>
<h3 id="第二种场景-后面url没有"><a href="#第二种场景-后面url没有" class="headerlink" title="第二种场景 后面url没有 /"></a>第二种场景 后面url没有 /</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">location /hello/ {</span><br><span class="line"> proxy_pass http://127.0.0.1;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p><code>结果</code>:代理到URL:<a href="http://127.0.0.1/hello/world.html" target="_blank" rel="external">http://127.0.0.1/hello/world.html</a></p>
<h3 id="第三种场景-后面url添加其它路由,并且最后添加"><a href="#第三种场景-后面url添加其它路由,并且最后添加" class="headerlink" title="第三种场景 后面url添加其它路由,并且最后添加 /"></a>第三种场景 后面url添加其它路由,并且最后添加 /</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">location /hello/ {</span><br><span class="line"> proxy_pass http://127.0.0.1/<span class="built_in">test</span>/;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p><code>结果</code>:代理到URL:<a href="http://127.0.0.1/test/world.html" target="_blank" rel="external">http://127.0.0.1/test/world.html</a></p>
<h3 id="第四种场景-后面url添加其它路由,但最后没有添加"><a href="#第四种场景-后面url添加其它路由,但最后没有添加" class="headerlink" title="第四种场景 后面url添加其它路由,但最后没有添加 /"></a>第四种场景 后面url添加其它路由,但最后没有添加 /</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">location /hello/ {</span><br><span class="line"> proxy_pass http://127.0.0.1/<span class="built_in">test</span>;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p><code>结果</code>:代理到URL:<a href="http://127.0.0.1/testworld.html" target="_blank" rel="external">http://127.0.0.1/testworld.html</a></p>
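<p>上面四种场景可以归纳为一条规则:<code>proxy_pass</code> 带 URI(host 之后有任何路径,哪怕只是 <code>/</code>)时,location 匹配到的前缀会被替换为该 URI,且是纯字符串拼接;不带 URI 时则原样转发完整的请求路径。下面用一个极简的 shell 函数来示意这条规则(仅为笔者自拟的示意代码,只覆盖前缀 location 的简单情况,不涉及 rewrite、正则 location 等):</p>

```shell
#!/bin/sh
# map_url: 模拟 proxy_pass 的前缀替换规则(示意用,非 Nginx 实际实现)
map_url() {
  location="$1"   # location 前缀,如 /hello/
  pass="$2"       # proxy_pass 的值
  req="$3"        # 客户端请求路径,如 /hello/world.html
  # 取出 scheme://host 部分,剩下的即 proxy_pass 自带的 URI
  host_part=$(printf '%s' "$pass" | sed -E 's#^(https?://[^/]+).*#\1#')
  uri_part=${pass#"$host_part"}
  if [ -z "$uri_part" ]; then
    # 不带 URI:原样拼接完整请求路径
    printf '%s%s\n' "$host_part" "$req"
  else
    # 带 URI:location 前缀被替换为该 URI(纯字符串拼接,不补斜杠)
    printf '%s%s%s\n' "$host_part" "$uri_part" "${req#"$location"}"
  fi
}

map_url /hello/ http://127.0.0.1/      /hello/world.html  # http://127.0.0.1/world.html
map_url /hello/ http://127.0.0.1       /hello/world.html  # http://127.0.0.1/hello/world.html
map_url /hello/ http://127.0.0.1/test/ /hello/world.html  # http://127.0.0.1/test/world.html
map_url /hello/ http://127.0.0.1/test  /hello/world.html  # http://127.0.0.1/testworld.html
```

<p>第四种场景产生 <code>/testworld.html</code> 这样的"粘连"路径,正是因为替换是纯字符串拼接,这也是实际配置中最容易踩的坑。</p>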
Docker与Containerd使用区别
https://www.yp14.cn/2022/03/20/Docker与Containerd使用区别/
2022-03-20T10:55:05.000Z
2022-03-20T10:56:43.261Z
<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p><code>Kubernetes</code> 在 <code>1.24</code> 版本里弃用并移除了 <code>docker shim</code>,这导致从 <code>1.24</code> 版本开始不再支持 <code>docker</code> 运行时。大部分用户会选择使用 <code>Containerd</code> 作为 Kubernetes 运行时。</p>
<blockquote>
<p>PS: <code>docker-ce</code> 底层就是 <code>Containerd</code></p>
</blockquote>
<p>使用 Containerd 时,kubelet 不再需要经过 <code>docker shim</code>,而是直接通过 <code>Container Runtime Interface (CRI)</code> 与容器运行时交互。调用层级减少了,也能减少很多 bug 的产生。</p>
<p>下面来讲讲 docker 与 Containerd 在使用上有哪些方面不同。</p>
<a id="more"></a>
<h2 id="Docker-和-Containerd-常用命令比较"><a href="#Docker-和-Containerd-常用命令比较" class="headerlink" title="Docker 和 Containerd 常用命令比较"></a>Docker 和 Containerd 常用命令比较</h2><table>
<thead>
<tr>
<th>镜像相关操作</th>
<th>Docker</th>
<th>Containerd</th>
</tr>
</thead>
<tbody>
<tr>
<td>显示本地镜像列表</td>
<td>docker images</td>
<td>crictl images</td>
</tr>
<tr>
<td>下载镜像</td>
<td>docker pull</td>
<td>crictl pull</td>
</tr>
<tr>
<td>上传镜像</td>
<td>docker push</td>
<td>无</td>
</tr>
<tr>
<td>删除本地镜像</td>
<td>docker rmi</td>
<td>crictl rmi</td>
</tr>
<tr>
<td>查看镜像详情</td>
<td>docker inspect IMAGE-ID</td>
<td>crictl inspect IMAGE-ID</td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<th>容器相关操作</th>
<th>Docker</th>
<th>Containerd</th>
</tr>
</thead>
<tbody>
<tr>
<td>显示容器列表</td>
<td>docker ps</td>
<td>crictl ps</td>
</tr>
<tr>
<td>创建容器</td>
<td>docker create</td>
<td>crictl create</td>
</tr>
<tr>
<td>启动容器</td>
<td>docker start</td>
<td>crictl start</td>
</tr>
<tr>
<td>停止容器</td>
<td>docker stop</td>
<td>crictl stop</td>
</tr>
<tr>
<td>删除容器</td>
<td>docker rm</td>
<td>crictl rm</td>
</tr>
<tr>
<td>查看容器详情</td>
<td>docker inspect</td>
<td>crictl inspect</td>
</tr>
<tr>
<td>attach</td>
<td>docker attach</td>
<td>crictl attach</td>
</tr>
<tr>
<td>exec</td>
<td>docker exec</td>
<td>crictl exec</td>
</tr>
<tr>
<td>logs</td>
<td>docker logs</td>
<td>crictl logs</td>
</tr>
<tr>
<td>stats</td>
<td>docker stats</td>
<td>crictl stats</td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<th>Pods相关操作</th>
<th>Docker</th>
<th>Containerd</th>
</tr>
</thead>
<tbody>
<tr>
<td>显示POD列表</td>
<td>无</td>
<td>crictl pods</td>
</tr>
<tr>
<td>查看POD详情</td>
<td>无</td>
<td>crictl inspectp</td>
</tr>
<tr>
<td>运行POD</td>
<td>无</td>
<td>crictl runp</td>
</tr>
<tr>
<td>停止POD</td>
<td>无</td>
<td>crictl stopp</td>
</tr>
</tbody>
</table>
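<p>上面几张表的对应关系也可以整理成一个简单的速查函数(仅为笔者自拟的字符串映射,命令名取自上表,"无" 表示 crictl 没有对应命令):</p>

```shell
#!/bin/sh
# ctr_cmd: 给定 docker 子命令,输出 Containerd(crictl)侧的等价命令
ctr_cmd() {
  case "$1" in
    # 镜像与容器操作基本同名,只是换成 crictl
    images|pull|rmi|ps|create|start|stop|rm|inspect|attach|exec|logs|stats)
      echo "crictl $1" ;;
    # crictl 不支持上传镜像(可改用 ctr 或 nerdctl)
    push) echo "无" ;;
    *) echo "未知命令: $1" ;;
  esac
}

ctr_cmd images   # crictl images
ctr_cmd push     # 无
```

<p>Pod 相关的 <code>crictl pods</code>、<code>crictl inspectp</code>、<code>crictl runp</code>、<code>crictl stopp</code> 在 docker 侧则完全没有对应命令,因为 Pod 是 CRI 层才有的概念。</p>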
<h2 id="容器日志和相关参数配置差异"><a href="#容器日志和相关参数配置差异" class="headerlink" title="容器日志和相关参数配置差异"></a>容器日志和相关参数配置差异</h2><table>
<thead>
<tr>
<th>功能</th>
<th>Docker</th>
<th>Containerd</th>
</tr>
</thead>
<tbody>
<tr>
<td>存储路径</td>
<td>如果 Docker 作为 K8S 容器运行时,容器日志的落盘将由 docker 来完成,保存在类似<code>/var/lib/docker/containers/$CONTAINERID</code> 目录下。Kubelet 会在 <code>/var/log/pods 和 /var/log/containers</code> 下面建立软链接,指向 <code>/var/lib/docker/containers/$CONTAINERID</code> 该目录下的容器日志文件。</td>
<td>如果 Containerd 作为 K8S 容器运行时, 容器日志的落盘由 Kubelet 来完成,保存至 <code>/var/log/pods/$CONTAINER_NAME</code> 目录下,同时在 <code>/var/log/containers</code> 目录下创建软链接,指向日志文件。</td>
</tr>
<tr>
<td>配置参数</td>
<td>在 docker 配置文件中指定:<code>"log-driver": "json-file", "log-opts": {"max-size": "100m","max-file": "5"}</code></td>
<td><code>方法一</code>:在 kubelet 参数中指定:<code>--container-log-max-files=5 --container-log-max-size="100Mi"</code> ;<code>方法二</code>:在 KubeletConfiguration 中指定:<code>"containerLogMaxSize": "100Mi", "containerLogMaxFiles": 5</code></td>
</tr>
<tr>
<td>容器日志保存到数据盘</td>
<td>把数据盘挂载到 “data-root”(缺省是 <code>/var/lib/docker</code>)即可。</td>
<td>创建一个软链接 <code>/var/log/pods</code> 指向数据盘挂载点下的某个目录 或者 通过挂载目录,把 <code>/var/log/pods</code> 目录挂载到数据盘上。</td>
</tr>
</tbody>
</table>
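<p>以表中"方法二"为例,KubeletConfiguration 里的日志轮转配置大致如下(示意片段:这里写入 <code>/tmp</code> 仅为演示,实际文件路径取决于 kubelet 的 <code>--config</code> 启动参数):</p>

```shell
#!/bin/sh
# 生成一份最小的 KubeletConfiguration 示意片段,包含容器日志轮转参数
cat > /tmp/kubelet-config-demo.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "100Mi"
containerLogMaxFiles: 5
EOF

# 确认两个轮转参数都已写入
grep -c 'containerLogMax' /tmp/kubelet-config-demo.yaml   # 2
```

<p>修改后需要重启 kubelet 才会生效;这两个参数只对 Containerd 这类由 kubelet 落盘日志的运行时有意义,docker 运行时的日志轮转仍由 docker 自身的 <code>log-opts</code> 控制。</p>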
<h2 id="CNI-网络"><a href="#CNI-网络" class="headerlink" title="CNI 网络"></a>CNI 网络</h2><table>
<thead>
<tr>
<th>功能</th>
<th>Docker</th>
<th>Containerd</th>
</tr>
</thead>
<tbody>
<tr>
<td>谁负责调用 CNI</td>
<td>Kubelet 内部的 docker-shim</td>
<td>Containerd 内置的 cri-plugin(containerd 1.1 以后)</td>
</tr>
<tr>
<td>如何配置 CNI</td>
<td>Kubelet 参数 <code>--cni-bin-dir</code> 和 <code>--cni-conf-dir</code></td>
<td>Containerd 配置文件(toml): <code>[plugins.cri.cni]</code> <code>bin_dir = "/opt/cni/bin"</code> <code>conf_dir = "/etc/cni/net.d"</code></td>
</tr>
</tbody>
</table>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul>
<li><a href="https://cloud.tencent.com/document/product/457/35747" target="_blank" rel="external">https://cloud.tencent.com/document/product/457/35747</a></li>
</ul>
docker exec 失败问题排查之旅
https://www.yp14.cn/2022/01/09/docker-exec-失败问题排查之旅/
2022-01-09T04:28:56.000Z
2022-01-09T04:30:38.126Z
<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>锄禾日当午,值班好辛苦;</p>
<p>汗滴禾下土,一查一下午。</p>
<h2 id="问题描述"><a href="#问题描述" class="headerlink" title="问题描述"></a>问题描述</h2><p>今天,在值班排查线上问题的过程中,发现系统日志一直在刷docker异常日志:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">May 12 09:08:40 HOSTNAME dockerd[4085]: time="2021-05-12T09:08:40.642410594+08:00" level=error msg="stream copy error: reading from a closed fifo"</span><br><span class="line">May 12 09:08:40 HOSTNAME dockerd[4085]: time="2021-05-12T09:08:40.642418571+08:00" level=error msg="stream copy error: reading from a closed fifo"</span><br><span class="line">May 12 09:08:40 HOSTNAME dockerd[4085]: time="2021-05-12T09:08:40.663754355+08:00" level=error msg="Error running exec 110deb1c1b2a2d2671d7368bd02bfc18a968e4712a3c771dedf0b362820e73cb in container: OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"read init-p: connection reset by peer\": unknown"</span><br></pre></td></tr></table></figure>
<p>从系统风险性上来看,异常日志出现的原因需要排查清楚,并摸清是否会对业务产生影响。</p>
<p>下文简单介绍问题排查的流程,以及产生的原因。</p>
<a id="more"></a>
<h2 id="问题排查"><a href="#问题排查" class="headerlink" title="问题排查"></a>问题排查</h2><p>现在我们唯一掌握的信息,只有系统日志中 dockerd 执行 exec 失败的报错。</p>
<p>在具体的问题分析之前,我们再来回顾一下docker的工作原理与调用链路:</p>
<p><img src="https://cdm.yp14.cn/img1/docker-call-path.png" alt=""></p>
<p>可见,docker的调用链路非常长,涉及组件也较多。因此,我们的排查路径主要分为如下两步:</p>
<ul>
<li>确定引起失败的组件</li>
<li>确定组件失败的原因</li>
</ul>
<h2 id="定位组件"><a href="#定位组件" class="headerlink" title="定位组件"></a>定位组件</h2><p>熟悉docker的用户能够一眼定位引起问题的组件。但是,我们还是按照常规的排查流程走一遍:</p>
<figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 1. 
定位问题容器</span></span><br><span class="line"># sudo docker ps | grep -v pause | grep -v NAMES | awk <span class="string">'{print $1}'</span> | xargs -ti sudo docker exec {} sleep <span class="number">1</span></span><br><span class="line">sudo docker exec aa1e331ec24f sleep <span class="number">1</span></span><br><span class="line">OCI runtime exec failed: exec failed: container_linux.<span class="keyword">go</span>:<span class="number">348</span>: starting container process caused <span class="string">"read init-p: connection reset by peer"</span>: unknown</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">// 2. 排除docker嫌疑</span></span><br><span class="line"># docker-containerd-ctr -a /<span class="keyword">var</span>/run/docker/containerd/docker-containerd.sock -n moby t exec --exec-id stupig1 aa1e331ec24f621ab3152ebe94f1e533734164af86c9df0f551eab2b1967ec4e sleep <span class="number">1</span></span><br><span class="line">ctr: OCI runtime exec failed: exec failed: container_linux.<span class="keyword">go</span>:<span class="number">348</span>: starting container process caused <span class="string">"read init-p: connection reset by peer"</span>: unknown</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">// 3. 
排除containerd与containerd-shim嫌疑</span></span><br><span class="line"># docker-runc --root /<span class="keyword">var</span>/run/docker/runtime-runc/moby/ exec aa1e331ec24f621ab3152ebe94f1e533734164af86c9df0f551eab2b1967ec4e sleep</span><br><span class="line">runtime/cgo: pthread_create failed: Resource temporarily unavailable</span><br><span class="line">SIGABRT: abort</span><br><span class="line">PC=<span class="number">0x6b657e</span> m=<span class="number">0</span> sigcode=<span class="number">18446744073709551610</span></span><br><span class="line"></span><br><span class="line">goroutine <span class="number">0</span> [idle]:</span><br><span class="line">runtime: unknown pc <span class="number">0x6b657e</span></span><br><span class="line">stack: frame={sp:<span class="number">0x7ffd30f0d</span>218, fp:<span class="number">0x0</span>} stack=[<span class="number">0x7ffd</span>2ab0e738,<span class="number">0x7ffd30f0d</span>760)</span><br><span class="line"><span class="number">00007f</span>fd30f0d118: <span class="number">0000000000000002</span> <span class="number">00007f</span>fd30f7f184</span><br><span class="line"><span class="number">00007f</span>fd30f0d128: <span class="number">000000000069</span>c31c <span class="number">00007f</span>fd30f0d1a8</span><br><span class="line"><span class="number">00007f</span>fd30f0d138: <span class="number">000000000045814</span>e <runtime.callCgoMmap+<span class="number">62</span>> <span class="number">00007f</span>fd30f0d140</span><br><span class="line"><span class="number">00007f</span>fd30f0d148: <span class="number">00007f</span>fd30f0d190 <span class="number">0000000000411</span>a88 <runtime.persistentalloc1+<span class="number">456</span>></span><br><span class="line"><span class="number">00007f</span>fd30f0d158: <span class="number">0000000000</span>bf6dd0 <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d168: <span class="number">0000000000010000</span> 
<span class="number">0000000000000008</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d178: <span class="number">0000000000</span>bf6dd8 <span class="number">0000000000</span>bf7ca0</span><br><span class="line"><span class="number">00007f</span>fd30f0d188: <span class="number">00007f</span>dcbb4b7000 <span class="number">00007f</span>fd30f0d1c8</span><br><span class="line"><span class="number">00007f</span>fd30f0d198: <span class="number">0000000000451205</span> <runtime.persistentalloc.func1+<span class="number">69</span>> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d1a8: <span class="number">0000000000000000</span> <span class="number">0000000000</span>c1c080</span><br><span class="line"><span class="number">00007f</span>fd30f0d1b8: <span class="number">00007f</span>dcbb4b7000 <span class="number">00007f</span>fd30f0d1e0</span><br><span class="line"><span class="number">00007f</span>fd30f0d1c8: <span class="number">00007f</span>fd30f0d210 <span class="number">00007f</span>fd30f0d220</span><br><span class="line"><span class="number">00007f</span>fd30f0d1d8: <span class="number">0000000000000000</span> <span class="number">00000000000000f</span>1</span><br><span class="line"><span class="number">00007f</span>fd30f0d1e8: <span class="number">0000000000000011</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d1f8: <span class="number">000000000069</span>c31c <span class="number">0000000000</span>c1c080</span><br><span class="line"><span class="number">00007f</span>fd30f0d208: <span class="number">000000000045814</span>e <runtime.callCgoMmap+<span class="number">62</span>> <span class="number">00007f</span>fd30f0d210</span><br><span class="line"><span class="number">00007f</span>fd30f0d218: <<span class="number">00007f</span>fd30f0d268 fffffffe7fffffff</span><br><span class="line"><span 
class="number">00007f</span>fd30f0d228: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d238: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d248: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d258: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d268: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d278: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d288: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d298: ffffffffffffffff <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2a8: <span class="number">00000000006</span>b68ba <span class="number">0000000000000020</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2b8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2c8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2d8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2e8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2f8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d308: <span class="number">0000000000000000</span> <span 
class="number">0000000000000000</span></span><br><span class="line">runtime: unknown pc <span class="number">0x6b657e</span></span><br><span class="line">stack: frame={sp:<span class="number">0x7ffd30f0d</span>218, fp:<span class="number">0x0</span>} stack=[<span class="number">0x7ffd</span>2ab0e738,<span class="number">0x7ffd30f0d</span>760)</span><br><span class="line"><span class="number">00007f</span>fd30f0d118: <span class="number">0000000000000002</span> <span class="number">00007f</span>fd30f7f184</span><br><span class="line"><span class="number">00007f</span>fd30f0d128: <span class="number">000000000069</span>c31c <span class="number">00007f</span>fd30f0d1a8</span><br><span class="line"><span class="number">00007f</span>fd30f0d138: <span class="number">000000000045814</span>e <runtime.callCgoMmap+<span class="number">62</span>> <span class="number">00007f</span>fd30f0d140</span><br><span class="line"><span class="number">00007f</span>fd30f0d148: <span class="number">00007f</span>fd30f0d190 <span class="number">0000000000411</span>a88 <runtime.persistentalloc1+<span class="number">456</span>></span><br><span class="line"><span class="number">00007f</span>fd30f0d158: <span class="number">0000000000</span>bf6dd0 <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d168: <span class="number">0000000000010000</span> <span class="number">0000000000000008</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d178: <span class="number">0000000000</span>bf6dd8 <span class="number">0000000000</span>bf7ca0</span><br><span class="line"><span class="number">00007f</span>fd30f0d188: <span class="number">00007f</span>dcbb4b7000 <span class="number">00007f</span>fd30f0d1c8</span><br><span class="line"><span class="number">00007f</span>fd30f0d198: <span class="number">0000000000451205</span> <runtime.persistentalloc.func1+<span class="number">69</span>> <span 
class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d1a8: <span class="number">0000000000000000</span> <span class="number">0000000000</span>c1c080</span><br><span class="line"><span class="number">00007f</span>fd30f0d1b8: <span class="number">00007f</span>dcbb4b7000 <span class="number">00007f</span>fd30f0d1e0</span><br><span class="line"><span class="number">00007f</span>fd30f0d1c8: <span class="number">00007f</span>fd30f0d210 <span class="number">00007f</span>fd30f0d220</span><br><span class="line"><span class="number">00007f</span>fd30f0d1d8: <span class="number">0000000000000000</span> <span class="number">00000000000000f</span>1</span><br><span class="line"><span class="number">00007f</span>fd30f0d1e8: <span class="number">0000000000000011</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d1f8: <span class="number">000000000069</span>c31c <span class="number">0000000000</span>c1c080</span><br><span class="line"><span class="number">00007f</span>fd30f0d208: <span class="number">000000000045814</span>e <runtime.callCgoMmap+<span class="number">62</span>> <span class="number">00007f</span>fd30f0d210</span><br><span class="line"><span class="number">00007f</span>fd30f0d218: <<span class="number">00007f</span>fd30f0d268 fffffffe7fffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d228: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d238: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d248: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d258: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d268: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span 
class="number">00007f</span>fd30f0d278: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d288: ffffffffffffffff ffffffffffffffff</span><br><span class="line"><span class="number">00007f</span>fd30f0d298: ffffffffffffffff <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2a8: <span class="number">00000000006</span>b68ba <span class="number">0000000000000020</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2b8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2c8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2d8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2e8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d2f8: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"><span class="number">00007f</span>fd30f0d308: <span class="number">0000000000000000</span> <span class="number">0000000000000000</span></span><br><span class="line"></span><br><span class="line">goroutine <span class="number">1</span> [running]:</span><br><span class="line">runtime.systemstack_switch()</span><br><span class="line"> /usr/local/<span class="keyword">go</span>/src/runtime/asm_amd64.s:<span class="number">363</span> fp=<span class="number">0xc4200f</span>e788 sp=<span class="number">0xc4200f</span>e780 pc=<span class="number">0x454120</span></span><br><span class="line">runtime.main()</span><br><span class="line"> /usr/local/<span 
class="keyword">go</span>/src/runtime/proc.<span class="keyword">go</span>:<span class="number">128</span> +<span class="number">0x63</span> fp=<span class="number">0xc4200f</span>e7e0 sp=<span class="number">0xc4200f</span>e788 pc=<span class="number">0x42bb83</span></span><br><span class="line">runtime.goexit()</span><br><span class="line"> /usr/local/<span class="keyword">go</span>/src/runtime/asm_amd64.s:<span class="number">2361</span> +<span class="number">0x1</span> fp=<span class="number">0xc4200f</span>e7e8 sp=<span class="number">0xc4200f</span>e7e0 pc=<span class="number">0x456c91</span></span><br><span class="line"></span><br><span class="line">rax <span class="number">0x0</span></span><br><span class="line">rbx <span class="number">0xbe2978</span></span><br><span class="line">rcx <span class="number">0x6b657e</span></span><br><span class="line">rdx <span class="number">0x0</span></span><br><span class="line">rdi <span class="number">0x2</span></span><br><span class="line">rsi <span class="number">0x7ffd30f0d</span>1a0</span><br><span class="line">rbp <span class="number">0x8347ce</span></span><br><span class="line">rsp <span class="number">0x7ffd30f0d</span>218</span><br><span class="line">r8 <span class="number">0x0</span></span><br><span class="line">r9 <span class="number">0x6</span></span><br><span class="line">r10 <span class="number">0x8</span></span><br><span class="line">r11 <span class="number">0x246</span></span><br><span class="line">r12 <span class="number">0x2bed</span>c30</span><br><span class="line">r13 <span class="number">0xf1</span></span><br><span class="line">r14 <span class="number">0x11</span></span><br><span class="line">r15 <span class="number">0x0</span></span><br><span class="line">rip <span class="number">0x6b657e</span></span><br><span class="line">rflags <span class="number">0x246</span></span><br><span class="line">cs <span class="number">0x33</span></span><br><span class="line">fs <span 
class="number">0x0</span></span><br><span class="line">gs <span class="number">0x0</span></span><br><span class="line">exec failed: container_linux.<span class="keyword">go</span>:<span class="number">348</span>: starting container process caused <span class="string">"read init-p: connection reset by peer"</span></span><br></pre></td></tr></table></figure>
<p>From the output above, we can tell the error is returned by runc.</p>
<h2 id="定位原因"><a href="#定位原因" class="headerlink" title="定位原因"></a>Locating the cause</h2><p>While pinpointing the faulty component, runc also handed us a gift: a detailed error log.</p>
<p>The log shows that runc exec failed because of <code>Resource temporarily unavailable</code>, a classic resource-exhaustion error. The common resource limits (see <code>ulimit -a</code>) are:</p>
<ul>
<li>the thread count has hit its limit</li>
<li>the open-file count has hit its limit</li>
<li>memory usage has hit its limit</li>
</ul>
<p>We therefore needed to dig into the business container's monitoring to identify which resource had run out.</p>
<p><img src="https://cdm.yp14.cn/img1/thread-monitor.png" alt=""></p>
<p>The chart above shows the containers' thread-count monitoring. Every container has reached 10,000 threads, and 10,000 is exactly the default per-container thread cap on our elastic-cloud platform; the cap exists to keep a single container's thread leak from exhausting the host's thread resources.</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># cat /sys/fs/cgroup/pids/kubepods/burstable/pod64a6c0e7-830c-11eb-86d6-b8cef604db88/aa1e331ec24f621ab3152ebe94f1e533734164af86c9df0f551eab2b1967ec4e/pids.max</span></span><br><span class="line"></span><br><span class="line">10000</span><br></pre></td></tr></table></figure>
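<p>When automating this check across many containers, note that <code>pids.max</code> holds either a decimal limit or the literal <code>max</code> (unlimited). A minimal sketch of parsing it; <code>parsePidsMax</code> is an illustrative helper, not part of any library:</p>

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePidsMax interprets the contents of a cgroup pids.max file:
// the literal "max" means unlimited, anything else is a decimal limit.
func parsePidsMax(raw string) (limit int64, unlimited bool, err error) {
	s := strings.TrimSpace(raw)
	if s == "max" {
		return 0, true, nil
	}
	limit, err = strconv.ParseInt(s, 10, 64)
	return limit, false, err
}

func main() {
	// In practice raw would come from reading
	// /sys/fs/cgroup/pids/<cgroup-path>/pids.max on the host.
	for _, raw := range []string{"10000\n", "max\n"} {
		limit, unlimited, err := parsePidsMax(raw)
		fmt.Println(limit, unlimited, err)
	}
}
```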
<p>At this point the root cause is fully identified. Yes, it really is that simple.</p>
<h2 id="runc梳理"><a href="#runc梳理" class="headerlink" title="runc梳理"></a>A walkthrough of runc</h2><p>Although we had identified what produced the error log, our mental model of how runc actually works had always been fuzzy.</p>
<p>Taking this opportunity, let's use runc exec as an example to walk through runc's workflow.</p>
<ul>
<li>runc exec first starts a child process, runc init</li>
<li>runc init is responsible for initializing the container namespaces<ul>
<li>runc init relies on the C constructor trick to set up the container namespaces before any Go code runs</li>
<li>the C code (nsexec) calls clone twice, producing three processes in total: a parent, a child, and a grandchild, which together complete the namespace initialization</li>
<li>once the parent and child have finished their initialization work they exit; the grandchild, now inside the container namespaces, starts the Go-side initialization and waits to receive the config sent by runc exec</li>
</ul>
</li>
<li>runc exec adds the grandchild to the container's cgroups</li>
<li>runc exec sends the config to the grandchild; it mainly contains the command to exec and its arguments</li>
<li>the grandchild calls system.Execv to run the user's command</li>
</ul>
<p>Notes:</p>
<ul>
<li>step 2.c and step 3 run concurrently</li>
<li>runc exec and runc init communicate over a socket pair (init-p and init-c)</li>
</ul>
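<p>The init-p/init-c handshake described above can be modeled with a plain socketpair. This is an illustrative sketch, not runc's actual code; <code>execConfig</code>, <code>roundTrip</code>, and the field names are made up:</p>

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"syscall"
)

// execConfig is an illustrative stand-in for the config runc exec
// sends to runc init (command, arguments, and so on).
type execConfig struct {
	Args []string `json:"args"`
}

// roundTrip models the handshake: the "runc exec" side writes JSON
// on init-p, the "runc init" side decodes it from init-c.
func roundTrip(cfg execConfig) (execConfig, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return execConfig{}, err
	}
	parent := os.NewFile(uintptr(fds[0]), "init-p")
	child := os.NewFile(uintptr(fds[1]), "init-c")
	defer parent.Close()
	defer child.Close()

	// runc exec side: send the config
	errc := make(chan error, 1)
	go func() { errc <- json.NewEncoder(parent).Encode(cfg) }()

	// runc init side: receive the config
	var got execConfig
	if err := json.NewDecoder(child).Decode(&got); err != nil {
		return execConfig{}, err
	}
	return got, <-errc
}

func main() {
	got, err := roundTrip(execConfig{Args: []string{"sleep", "1"}})
	fmt.Println(got.Args, err)
}
```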
<p>The diagram below shows how the processes interact during runc exec, and how the namespaces and cgroups are initialized:</p>
<p><img src="https://cdm.yp14.cn/img1/runc-detail.png" alt=""></p>
<p>Combining this walkthrough of the runc exec flow with the error message it returned, we can pinpoint the code that produces the error:</p>
<figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(p *setnsProcess)</span> <span class="title">start</span><span class="params">()</span> <span class="params">(err error)</span></span> {</span><br><span class="line"> <span class="keyword">defer</span> p.parentPipe.Close()</span><br><span class="line"> err = p.cmd.Start()</span><br><span class="line"> p.childPipe.Close()</span><br><span class="line"> <span class="keyword">if</span> err != <span class="literal">nil</span> {</span><br><span class="line"> <span 
class="keyword">return</span> newSystemErrorWithCause(err, <span class="string">"starting setns process"</span>)</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> p.bootstrapData != <span class="literal">nil</span> {</span><br><span class="line"> <span class="keyword">if</span> _, err := io.Copy(p.parentPipe, p.bootstrapData); err != <span class="literal">nil</span> { <span class="comment">// clone标志位,ns配置</span></span><br><span class="line"> <span class="keyword">return</span> newSystemErrorWithCause(err, <span class="string">"copying bootstrap data to pipe"</span>)</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> err = p.execSetns(); err != <span class="literal">nil</span> {</span><br><span class="line"> <span class="keyword">return</span> newSystemErrorWithCause(err, <span class="string">"executing setns process"</span>)</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> <span class="built_in">len</span>(p.cgroupPaths) > <span class="number">0</span> {</span><br><span class="line"> <span class="keyword">if</span> err := cgroups.EnterPid(p.cgroupPaths, p.pid()); err != <span class="literal">nil</span> { <span class="comment">// 这里将runc init添加到容器cgroup中</span></span><br><span class="line"> <span class="keyword">return</span> newSystemErrorWithCausef(err, <span class="string">"adding pid %d to cgroups"</span>, p.pid())</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">if</span> err := utils.WriteJSON(p.parentPipe, p.config); err != <span class="literal">nil</span> { <span class="comment">// 发送配置:命令、环境变量等</span></span><br><span class="line"> <span class="keyword">return</span> newSystemErrorWithCause(err, <span class="string">"writing config to pipe"</span>)</span><br><span class="line"> }</span><br><span class="line"></span><br><span 
class="line"> ierr := parseSync(p.parentPipe, <span class="function"><span class="keyword">func</span><span class="params">(sync *syncT)</span> <span class="title">error</span></span> { <span class="comment">// 这里返回 read init-p: connection reset by peer</span></span><br><span class="line"> <span class="keyword">switch</span> sync.Type {</span><br><span class="line"> <span class="keyword">case</span> procReady:</span><br><span class="line"> <span class="comment">// This shouldn't happen.</span></span><br><span class="line"> <span class="built_in">panic</span>(<span class="string">"unexpected procReady in setns"</span>)</span><br><span class="line"> <span class="keyword">case</span> procHooks:</span><br><span class="line"> <span class="comment">// This shouldn't happen.</span></span><br><span class="line"> <span class="built_in">panic</span>(<span class="string">"unexpected procHooks in setns"</span>)</span><br><span class="line"> <span class="keyword">default</span>:</span><br><span class="line"> <span class="keyword">return</span> newSystemError(fmt.Errorf(<span class="string">"invalid JSON payload from child"</span>))</span><br><span class="line"> }</span><br><span class="line"> })</span><br><span class="line"> <span class="keyword">if</span> ierr != <span class="literal">nil</span> {</span><br><span class="line"> p.wait()</span><br><span class="line"> <span class="keyword">return</span> ierr</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>
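<p>As for the exact error text, <code>read init-p: connection reset by peer</code> is what a Unix stream socket returns on Linux when the peer closes while unread data is still queued; that matches runc init aborting (here, because pthread_create hit the pids limit) before consuming the config. A minimal reproduction of just that socket behavior, unrelated to runc itself:</p>

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// readAfterPeerDies models runc exec's view of the failure: it writes
// the config toward init-c, the peer dies without reading it, and the
// next read on init-p fails. On Linux, closing a unix stream socket
// that still has unread data in its receive queue resets the
// connection for the peer.
func readAfterPeerDies() error {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return err
	}
	parent := os.NewFile(uintptr(fds[0]), "init-p")
	child := os.NewFile(uintptr(fds[1]), "init-c")
	defer parent.Close()

	parent.Write([]byte(`{"args":["sleep","1"]}`)) // config on its way
	child.Close()                                  // "runc init" dies without reading it

	_, err = parent.Read(make([]byte, 1))
	return err // expected on Linux: read init-p: connection reset by peer
}

func main() {
	fmt.Println(readAfterPeerDies())
}
```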
<p>With that, both the root-cause analysis and the code walkthrough are complete.</p>
<h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul>
<li><a href="https://www.kernel.org/doc/Documentation/cgroup-v1/pids.txt" target="_blank" rel="external">https://www.kernel.org/doc/Documentation/cgroup-v1/pids.txt</a></li>
<li><a href="https://github.com/opencontainers/runc" target="_blank" rel="external">https://github.com/opencontainers/runc</a></li>
</ul>
<blockquote>
<ul>
<li>Author: plpan</li>
<li>Original post: <a href="https://plpan.github.io/docker-exec-%E5%A4%B1%E8%B4%A5%E9%97%AE%E9%A2%98%E6%8E%92%E6%9F%A5%E4%B9%8B%E6%97%85/" target="_blank" rel="external">https://plpan.github.io/docker-exec-%E5%A4%B1%E8%B4%A5%E9%97%AE%E9%A2%98%E6%8E%92%E6%9F%A5%E4%B9%8B%E6%97%85/</a></li>
</ul>
</blockquote>
<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>Preface</h2><p>Hoeing grain at high noon, on-call duty is bitter toil;</p>
<p>sweat drips onto the soil, and one lookup eats a whole afternoon.</p>
<h2 id="问题描述"><a href="#问题描述" class="headerlink" title="问题描述"></a>Problem description</h2><p>Today, while investigating a production issue on call, I noticed the system log was continuously printing docker errors:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">May 12 09:08:40 HOSTNAME dockerd[4085]: time="2021-05-12T09:08:40.642410594+08:00" level=error msg="stream copy error: reading from a closed fifo"</span><br><span class="line">May 12 09:08:40 HOSTNAME dockerd[4085]: time="2021-05-12T09:08:40.642418571+08:00" level=error msg="stream copy error: reading from a closed fifo"</span><br><span class="line">May 12 09:08:40 HOSTNAME dockerd[4085]: time="2021-05-12T09:08:40.663754355+08:00" level=error msg="Error running exec 110deb1c1b2a2d2671d7368bd02bfc18a968e4712a3c771dedf0b362820e73cb in container: OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"read init-p: connection reset by peer\": unknown"</span><br></pre></td></tr></table></figure>
<p>From a system-risk standpoint, the cause of these errors needed to be tracked down, along with whether they affect business traffic.</p>
<p>The following sections briefly describe the troubleshooting process and the root cause.</p>
阿里云ACK多个Service绑定单个SLB实践
https://www.yp14.cn/2021/12/08/阿里云ACK多个Service绑定单个SLB实践/
2021-12-08T03:59:26.000Z
2021-12-08T04:00:05.417Z
<h2 id="阿里云ACK是什么"><a href="#阿里云ACK是什么" class="headerlink" title="阿里云ACK是什么"></a>What is Alibaba Cloud ACK</h2><p>Alibaba Cloud Container Service for Kubernetes (ACK) was among the first platforms worldwide to pass the Kubernetes conformance certification. It provides high-performance management of containerized applications and lifecycle management for enterprise-grade Kubernetes workloads, making it easy to run Kubernetes applications in the cloud.</p>
<h2 id="Service几种暴露方式"><a href="#Service几种暴露方式" class="headerlink" title="Service几种暴露方式"></a>Ways to expose a Service</h2><p>A Kubernetes Service can be exposed in the following ways:</p>
<ul>
<li><code>NodePort</code>: exposes the Service on each node's IP at a static port (the NodePort). A NodePort Service routes to an automatically created ClusterIP Service, and you can reach it from outside the cluster at NodeIP:NodePort.</li>
<li><code>hostNetwork: true</code>: the Pod runs in the host's network namespace, so the application sees the host's network interfaces directly.</li>
<li><code>hostPort</code>: maps a container port straight to a port on the node the Pod is scheduled on, so users can reach the Pod via the host IP plus that port.</li>
<li><code>LoadBalancer</code>: exposes the Service externally using a cloud provider's load balancer, which routes traffic to the automatically created NodePort and ClusterIP Services.</li>
<li><code>ExternalName</code>: maps the Service to the contents of the externalName field (for example, foo.bar.example.com) by returning a CNAME record; no proxy of any kind is created.</li>
<li><code>Ingress</code>: a resource type introduced in Kubernetes 1.1. Creating Ingress resources requires an Ingress controller, which ships as a pluggable component: a container bundling a load balancer such as nginx or HAProxy with a controller daemon. The daemon receives the desired Ingress configuration from Kubernetes, generates an nginx or HAProxy config file, and reloads the load balancer to apply the changes. In other words, an Ingress controller is a load balancer managed by Kubernetes.</li>
</ul>
<a id="more"></a>
<h2 id="需求"><a href="#需求" class="headerlink" title="需求"></a>Requirement</h2><p>With Alibaba Cloud <code>ACK</code>, Services support type <code>LoadBalancer</code> out of the box, so the first instinct may be to bind one SLB per Service, but that wastes SLB instances. External access to the cluster can also go through an <code>Ingress</code>.</p>
<p>Ingress can do this, but it is a poor fit when many endpoints share one domain yet listen on ports other than 80 or 443, for example:</p>
<ul>
<li>www.example.com:8888</li>
<li>www.example.com:8080</li>
<li>www.example.com:8081</li>
</ul>
<p>In that case Ingress is generally not recommended: it would have to open many ports, which becomes hard to maintain.</p>
<p>So is there a way to meet this requirement while making full use of each SLB? There is: <code>ACK supports binding multiple Services to multiple ports of a single SLB</code>.</p>
<h2 id="多个Service绑定一个SLB多个端口用法"><a href="#多个Service绑定一个SLB多个端口用法" class="headerlink" title="多个Service绑定一个SLB多个端口用法"></a>Binding multiple Services to ports of one SLB</h2><blockquote>
<p>PS: <code>Prerequisite</code>: create the SLB in the SLB console beforehand; it must be in the same VPC as the Kubernetes cluster</p>
</blockquote>
<h3 id="Service声明TCP协议"><a href="#Service声明TCP协议" class="headerlink" title="Service声明TCP协议"></a>Service声明TCP协议</h3><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> v1</span><br><span class="line"><span class="attr">kind:</span> Service</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-force-override-listeners: <span class="string">"true"</span></span><br><span class="line"> <span class="comment"># 仅支持TCP和UDP协议。如需设置连接优雅中断,以下两项Annotation必选</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain: <span class="string">"on"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain-timeout: <span class="string">"30"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-delete-protection: <span class="string">"on"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-modification-protection: <span 
class="string">"ConsoleProtection"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: <span class="string">"slb-id"</span></span><br><span class="line"> <span class="comment"># ACK是Terway网络模式下,通过annotation:service.beta.kubernetes.io/backend-type:"eni"将Pod直接挂载到SLB后端,提升网络转发性能。</span></span><br><span class="line"> service.beta.kubernetes.io/backend-type: <span class="string">"eni"</span></span><br><span class="line"><span class="attr"> name:</span> nginx</span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> externalTrafficPolicy:</span> Cluster</span><br><span class="line"><span class="attr"> ports:</span></span><br><span class="line"><span class="attr"> - port:</span> <span class="number">80</span></span><br><span class="line"><span class="attr"> protocol:</span> TCP</span><br><span class="line"><span class="attr"> targetPort:</span> <span class="number">80</span></span><br><span class="line"><span class="attr"> selector:</span></span><br><span class="line"><span class="attr"> app:</span> nginx</span><br><span class="line"><span class="attr"> type:</span> LoadBalancer</span><br></pre></td></tr></table></figure>
<h3 id="Service声明使用https协议"><a href="#Service声明使用https协议" class="headerlink" title="Service声明使用https协议"></a>Service声明使用https协议</h3><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> v1</span><br><span class="line"><span class="attr">kind:</span> Service</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-protocol-port: <span class="string">"https:8888"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-flag: <span class="string">"on"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-type: <span class="string">"http"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-uri: <span class="string">"/health"</span></span><br><span class="line"> 
service.beta.kubernetes.io/alibaba-cloud-loadbalancer-healthy-threshold: <span class="string">"4"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-unhealthy-threshold: <span class="string">"4"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-timeout: <span class="string">"10"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-interval: <span class="string">"3"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-cert-id: <span class="string">"证书id"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-force-override-listeners: <span class="string">"true"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-delete-protection: <span class="string">"on"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-modification-protection: <span class="string">"ConsoleProtection"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: <span class="string">"slb-id"</span></span><br><span class="line"> <span class="comment"># ACK是Terway网络模式下,通过annotation:service.beta.kubernetes.io/backend-type:"eni"将Pod直接挂载到SLB后端,提升网络转发性能。</span></span><br><span class="line"> service.beta.kubernetes.io/backend-type: eni</span><br><span class="line"><span class="attr"> name:</span> nginx</span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> externalTrafficPolicy:</span> Cluster</span><br><span class="line"><span class="attr"> ports:</span></span><br><span class="line"><span class="attr"> - port:</span> <span class="number">80</span></span><br><span class="line"><span class="attr"> protocol:</span> TCP</span><br><span class="line"><span class="attr"> targetPort:</span> <span 
class="number">80</span></span><br><span class="line"><span class="attr"> selector:</span></span><br><span class="line"><span class="attr"> app:</span> nginx</span><br><span class="line"><span class="attr"> type:</span> LoadBalancer</span><br></pre></td></tr></table></figure>
<h3 id="Service声明使用http协议"><a href="#Service声明使用http协议" class="headerlink" title="Service声明使用http协议"></a>Service声明使用http协议</h3><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> v1</span><br><span class="line"><span class="attr">kind:</span> Service</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-protocol-port: <span class="string">"http:8080"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-flag: <span class="string">"on"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-type: <span class="string">"http"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-uri: <span class="string">"/health"</span></span><br><span class="line"> 
service.beta.kubernetes.io/alibaba-cloud-loadbalancer-healthy-threshold: <span class="string">"4"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-unhealthy-threshold: <span class="string">"4"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-timeout: <span class="string">"10"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-interval: <span class="string">"3"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-force-override-listeners: <span class="string">"true"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-delete-protection: <span class="string">"on"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-modification-protection: <span class="string">"ConsoleProtection"</span></span><br><span class="line"> service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: <span class="string">"slb-id"</span></span><br><span class="line"> <span class="comment"># ACK是Terway网络模式下,通过annotation:service.beta.kubernetes.io/backend-type:"eni"将Pod直接挂载到SLB后端,提升网络转发性能。</span></span><br><span class="line"> service.beta.kubernetes.io/backend-type: eni</span><br><span class="line"><span class="attr"> name:</span> nginx</span><br><span class="line"><span class="attr"> namespace:</span> default</span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> externalTrafficPolicy:</span> Cluster</span><br><span class="line"><span class="attr"> ports:</span></span><br><span class="line"><span class="attr"> - port:</span> <span class="number">80</span></span><br><span class="line"><span class="attr"> protocol:</span> TCP</span><br><span class="line"><span class="attr"> targetPort:</span> <span class="number">80</span></span><br><span class="line"><span class="attr"> 
selector:</span></span><br><span class="line"><span class="attr"> run:</span> nginx</span><br><span class="line"><span class="attr"> type:</span> LoadBalancer</span><br></pre></td></tr></table></figure>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul>
<li><a href="https://help.aliyun.com/document_detail/86531.html?spm=5176.21213303.J_6704733920.7.c5983eda1xJwdH&scm=20140722.S_help%40%40%E6%96%87%E6%A1%A3%40%4086531.S_0%2Bos0.ID_86531-RL_serviceDOTbetaDOTkubernetes-OR_helpmain-V_2-P0_0" target="_blank" rel="external">https://help.aliyun.com/document_detail/86531.html?spm=5176.21213303.J_6704733920.7.c5983eda1xJwdH&scm=20140722.S_help%40%40%E6%96%87%E6%A1%A3%40%4086531.S_0%2Bos0.ID_86531-RL_serviceDOTbetaDOTkubernetes-OR_helpmain-V_2-P0_0</a></li>
<li><a href="https://help.aliyun.com/document_detail/181517.html" target="_blank" rel="external">https://help.aliyun.com/document_detail/181517.html</a></li>
<li><a href="https://kubernetes.io/zh/docs/concepts/services-networking/service/#externalname" target="_blank" rel="external">https://kubernetes.io/zh/docs/concepts/services-networking/service/#externalname</a></li>
</ul>
<h2 id="阿里云ACK是什么"><a href="#阿里云ACK是什么" class="headerlink" title="阿里云ACK是什么"></a>阿里云ACK是什么</h2><p>阿里云容器服务Kubernetes版(Alibaba Cloud Container Service for Kubernetes,简称容器服务ACK)是全球首批通过Kubernetes一致性认证的服务平台,提供高性能的容器应用管理服务,支持企业级Kubernetes容器化应用的生命周期管理,让您轻松高效地在云端运行Kubernetes容器化应用。</p>
<h2 id="Service几种暴露方式"><a href="#Service几种暴露方式" class="headerlink" title="Service几种暴露方式"></a>Service几种暴露方式</h2><p>Kubernetes Service 支持下面一些暴露方式:</p>
<ul>
<li><code>NodePort</code>:通过每个节点上的 IP 和静态端口(NodePort)暴露服务。 NodePort 服务会路由到自动创建的 ClusterIP 服务。 通过请求 <节点 IP>:<节点端口>,你可以从集群的外部访问一个 NodePort 服务。</li>
<li><code>hostNetwork: true</code>:Pod 使用宿主机的网络命名空间,Pod 中运行的应用程序可以直接使用所在节点的网络接口。</li>
<li><code>hostPort</code>:直接将容器端口映射到所调度节点上的端口,这样用户就可以通过宿主机IP加端口来访问Pod。</li>
<li><code>LoadBalancer</code>:使用云提供商的负载均衡器向外部暴露服务。 外部负载均衡器可以将流量路由到自动创建的 NodePort 服务和 ClusterIP 服务上。</li>
<li><code>ExternalName</code>:通过返回 CNAME 和对应值,可以将服务映射到 externalName 字段的内容(例如,foo.bar.example.com)。 无需创建任何类型代理。</li>
<li><code>Ingress</code>:是自 kubernetes 1.1 版本后引入的资源类型。必须先部署 Ingress controller 才能创建 Ingress 资源,Ingress controller 以插件的形式提供,是部署在 Kubernetes 之上的 Docker 容器。它的镜像包含一个像 nginx 或 HAProxy 的负载均衡器和一个控制器守护进程。控制器守护进程从 Kubernetes 接收所需的 Ingress 配置,生成 nginx 或 HAProxy 配置文件,并重新加载负载均衡器进程以使更改生效。换句话说,Ingress controller 是由 Kubernetes 管理的负载均衡器。</li>
</ul>
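上面几种暴露方式中,NodePort 是最常见的入门用法。下面是一个最小的 NodePort Service 清单示意(其中 nginx 选择器和 30080 端口均为假设取值,按实际业务替换):

```shell
# 生成一个最小的 NodePort Service 清单(app=nginx、30080 均为示意取值)
cat > nodeport-svc.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80          # 集群内访问端口
    targetPort: 80    # 容器端口
    nodePort: 30080   # 节点静态端口(默认范围 30000-32767)
EOF

# 应用后即可通过 <节点IP>:30080 从集群外访问
# kubectl apply -f nodeport-svc.yaml
```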
K8S部署分布式调度任务Airflow
https://www.yp14.cn/2021/11/28/K8S部署分布式调度任务Airflow/
2021-11-28T07:46:43.000Z
2021-11-28T07:48:33.633Z
<h2 id="一、部署要求"><a href="#一、部署要求" class="headerlink" title="一、部署要求"></a>一、部署要求</h2><p>Apache Airflow 已通过以下测试:</p>
<table>
<thead>
<tr>
<th></th>
<th>Main version (dev)</th>
<th>Stable version (2.1.4)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Python</td>
<td>3.6, 3.7, 3.8, 3.9</td>
<td>3.6, 3.7, 3.8, 3.9</td>
</tr>
<tr>
<td>Kubernetes</td>
<td>1.20, 1.19, 1.18</td>
<td>1.20, 1.19, 1.18</td>
</tr>
<tr>
<td>PostgreSQL</td>
<td>9.6, 10, 11, 12, 13</td>
<td>9.6, 10, 11, 12, 13</td>
</tr>
<tr>
<td>MySQL</td>
<td>5.7, 8</td>
<td>5.7, 8</td>
</tr>
<tr>
<td>SQLite</td>
<td>3.15.0+</td>
<td>3.15.0+</td>
</tr>
<tr>
<td>MSSQL(Experimental)</td>
<td>2017,2019</td>
<td></td>
</tr>
</tbody>
</table>
<p><strong>注意:</strong> MySQL 5.x 版本不支持运行多个调度程序,或存在相关限制——请参阅调度程序文档。MariaDB 未经测试,不推荐使用。</p>
<p><strong>注意:</strong> SQLite 用于 Airflow 测试。不要在生产中使用它。我们建议使用最新的 SQLite 稳定版本进行本地开发。</p>
<blockquote>
<p>PS:本文部署 <code>Airflow</code> 稳定版 <code>2.1.4</code>,<code>Kubernetes</code>使用<code>1.20.x</code>版本,<code>PostgreSQL</code>使用<code>12.x</code>,使用<code>Helm Charts</code>部署。</p>
</blockquote>
<a id="more"></a>
<h2 id="二、生成Helm-Charts配置"><a href="#二、生成Helm-Charts配置" class="headerlink" title="二、生成Helm Charts配置"></a>二、生成Helm Charts配置</h2><blockquote>
<p>PS:使用 helm 3 版本部署</p>
</blockquote>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 创建kubernetes airflow 命名空间</span></span><br><span class="line">$ kubectl create namespace airflow</span><br><span class="line"></span><br><span class="line"><span class="comment"># 添加 airflow charts 仓库源</span></span><br><span class="line">$ helm repo add apache-airflow https://airflow.apache.org</span><br><span class="line"></span><br><span class="line"><span class="comment"># 更新 aiarflow 源</span></span><br><span class="line">$ helm repo update</span><br><span class="line"></span><br><span class="line"><span class="comment"># 查看 airflow charts 所有版本(这里选择部署charts 1.2.0,也就是airflow 2.1.4)</span></span><br><span class="line">$ helm search repo apache-airflow/airflow <span class="_">-l</span></span><br><span class="line"></span><br><span class="line">NAME CHART VERSION APP VERSION DESCRIPTION</span><br><span class="line">apache-airflow/airflow 1.3.0 2.2.1 The official Helm chart to deploy Apache Airflo...</span><br><span class="line">apache-airflow/airflow 1.2.0 2.1.4 The official Helm chart to deploy Apache Airflo...</span><br><span class="line">apache-airflow/airflow 1.1.0 2.1.2 The official Helm chart to deploy Apache Airflo...</span><br><span class="line">apache-airflow/airflow 1.0.0 2.0.2 Helm chart to deploy 
Apache Airflow, a platform...</span><br><span class="line"></span><br><span class="line"><span class="comment"># 导出 airflow charts values.yaml 文件</span></span><br><span class="line">$ helm show values apache-airflow/airflow --version 1.2.0 > airflow_1.2.4_values.yaml</span><br></pre></td></tr></table></figure>
<h2 id="三、修改airflow配置"><a href="#三、修改airflow配置" class="headerlink" title="三、修改airflow配置"></a>三、修改airflow配置</h2><h3 id="3-1-配置持续存储-StorageClass"><a href="#3-1-配置持续存储-StorageClass" class="headerlink" title="3.1 配置持续存储 StorageClass"></a>3.1 配置持续存储 StorageClass</h3><blockquote>
<p>PS: 使用阿里云<code>NAS极速存储</code></p>
</blockquote>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 编辑 StorageClass 文件</span></span><br><span class="line">$ vim alicloud-nas-airflow-test.yaml</span><br><span class="line"></span><br><span class="line">apiVersion: storage.k8s.io/v1</span><br><span class="line">kind: StorageClass</span><br><span class="line">metadata:</span><br><span class="line"> name: alicloud-nas-airflow-test</span><br><span class="line">mountOptions:</span><br><span class="line"> - nolock,tcp,noresvport</span><br><span class="line"> - vers=3</span><br><span class="line">parameters:</span><br><span class="line"> volumeAs: subpath</span><br><span class="line"> server: <span class="string">"xxxxx.cn-beijing.extreme.nas.aliyuncs.com:/share/airflow/"</span></span><br><span class="line">provisioner: nasplugin.csi.alibabacloud.com</span><br><span class="line">reclaimPolicy: Retain</span><br><span class="line"></span><br><span class="line"><span class="comment"># 应用到K8S中</span></span><br><span class="line">$ kubectl apply <span class="_">-f</span> alicloud-nas-airflow-test.yaml</span><br></pre></td></tr></table></figure>
<h3 id="3-2-配置-airflow-Dags-存储仓库-gitSshKey"><a href="#3-2-配置-airflow-Dags-存储仓库-gitSshKey" class="headerlink" title="3.2 配置 airflow Dags 存储仓库 gitSshKey"></a>3.2 配置 airflow Dags 存储仓库 gitSshKey</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 编辑 airflow-ssh-secret.yaml 文件,首先需要把shh公钥添加到git项目仓库中</span></span><br><span class="line">$ vim airflow-ssh-secret.yaml</span><br><span class="line"></span><br><span class="line">apiVersion: v1</span><br><span class="line">kind: Secret</span><br><span class="line">metadata:</span><br><span class="line"> name: airflow-ssh-secret</span><br><span class="line"> namespace: airflow</span><br><span class="line">data:</span><br><span class="line"> <span class="comment"># key needs to be gitSshKey</span></span><br><span class="line"> gitSshKey: <span class="string">"ssh私钥,base64"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 应用到K8S中</span></span><br><span class="line">$ kubectl apply <span class="_">-f</span> airflow-ssh-secret.yaml</span><br></pre></td></tr></table></figure>
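上面 Secret 中的 <code>gitSshKey</code> 字段要求填入 base64 编码后的 ssh 私钥。编码方式可以参考下面的示意(这里用临时文件代替真实私钥路径,实际使用时替换成你自己的私钥文件):

```shell
# 演示如何得到 gitSshKey 字段所需的 base64 字符串
# (/tmp/id_rsa_demo 为示意文件,实际应替换为你的 ssh 私钥路径)
printf 'demo-private-key' > /tmp/id_rsa_demo
gitSshKey=$(base64 < /tmp/id_rsa_demo | tr -d '\n')

# 填入 Secret 前可先解码校验内容是否一致
decoded=$(echo "$gitSshKey" | base64 -d)
echo "$decoded"
```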
<h3 id="3-3-Docker-部署-PostgreSQL-12"><a href="#3-3-Docker-部署-PostgreSQL-12" class="headerlink" title="3.3 Docker 部署 PostgreSQL 12"></a>3.3 Docker 部署 PostgreSQL 12</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 创建 postgresql 存储目录</span></span><br><span class="line">$ mkdir /data/postgresql_data</span><br><span class="line"></span><br><span class="line"><span class="comment"># 创建启动文件 </span></span><br><span class="line">$ vim docker-compose.yaml</span><br><span class="line"></span><br><span class="line">version: <span class="string">"3"</span></span><br><span class="line"></span><br><span class="line">services:</span><br><span class="line"> airflow-postgres:</span><br><span class="line"> image: postgres:12</span><br><span class="line"> restart: always</span><br><span class="line"> container_name: airflow-postgres</span><br><span class="line"> environment:</span><br><span class="line"> TZ: Asia/Shanghai</span><br><span class="line"> POSTGRES_USER: airflow</span><br><span class="line"> POSTGRES_PASSWORD: Airflow123</span><br><span class="line"> volumes:</span><br><span class="line"> - 
/data/postgresql_data:/var/lib/postgresql/data</span><br><span class="line"> ports:</span><br><span class="line"> - <span class="string">"5432:5432"</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 启动 postgresql docker</span></span><br><span class="line">$ docker-compose up <span class="_">-d</span></span><br></pre></td></tr></table></figure>
<h3 id="3-4-修改-airflow-1-2-4-values-yaml-配置"><a href="#3-4-修改-airflow-1-2-4-values-yaml-配置" class="headerlink" title="3.4 修改 airflow_1.2.4_values.yaml 配置"></a>3.4 修改 airflow_1.2.4_values.yaml 配置</h3><blockquote>
<p>PS:本文 airflow_1.2.4_values.yaml 配置文件需要三个 PVC,分别用于 redis、worker(本文只部署 1 个 worker,也可以部署多个)和 dags</p>
</blockquote>
<p>因配置文件太长,不具体贴出,具体内容请参考下面链接:</p>
<p><a href="https://github.com/yangpeng14/DevOps/blob/master/config_dir/airflow_1.2.4_values.yaml" target="_blank" rel="external">https://github.com/yangpeng14/DevOps/blob/master/config_dir/airflow_1.2.4_values.yaml</a></p>
<h2 id="四、部署-Airfolw"><a href="#四、部署-Airfolw" class="headerlink" title="四、部署 Airflow"></a>四、部署 Airflow</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 第一次部署 Airflow</span></span><br><span class="line">$ helm install airflow apache-airflow/airflow --namespace airflow --version 1.2.0 <span class="_">-f</span> airflow_1.2.4_values.yaml</span><br><span class="line"></span><br><span class="line"><span class="comment"># 以后如果要修改airflow配置,请使用下面命令</span></span><br><span class="line">$ helm upgrade --install airflow apache-airflow/airflow --namespace airflow --version 1.2.0 <span class="_">-f</span> airflow_1.2.4_values.yaml</span><br></pre></td></tr></table></figure>
<h2 id="五、配置-Airflow-Ingress-Nginx-访问入口"><a href="#五、配置-Airflow-Ingress-Nginx-访问入口" class="headerlink" title="五、配置 Airflow Ingress Nginx 访问入口"></a>五、配置 Airflow Ingress Nginx 访问入口</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 生成 ingress nginx 配置文件</span></span><br><span class="line">$ vim airflow-ingress.yaml</span><br><span class="line"></span><br><span class="line">apiVersion: networking.k8s.io/v1</span><br><span class="line">kind: Ingress</span><br><span class="line">metadata:</span><br><span class="line"> name: airflow</span><br><span class="line"> namespace: airflow</span><br><span class="line"> annotations:</span><br><span class="line"> kubernetes.io/ingress.class: nginx</span><br><span class="line"> nginx.ingress.kubernetes.io/ssl-redirect: <span class="string">"false"</span></span><br><span class="line"> nginx.ingress.kubernetes.io/proxy-connect-timeout: <span class="string">"60"</span></span><br><span class="line"> 
nginx.ingress.kubernetes.io/proxy-read-timeout: <span class="string">"60"</span></span><br><span class="line"> nginx.ingress.kubernetes.io/proxy-send-timeout: <span class="string">"60"</span></span><br><span class="line">spec:</span><br><span class="line"> rules:</span><br><span class="line"> - host: <span class="string">"airflow.example.com"</span></span><br><span class="line"> http:</span><br><span class="line"> paths:</span><br><span class="line"> - path: /</span><br><span class="line"> pathType: Prefix</span><br><span class="line"> backend:</span><br><span class="line"> service:</span><br><span class="line"> name: airflow-webserver</span><br><span class="line"> port:</span><br><span class="line"> number: 8080</span><br><span class="line"></span><br><span class="line"><span class="comment"># 应用到K8S中</span></span><br><span class="line">$ kubectl apply <span class="_">-f</span> airflow-ingress.yaml</span><br></pre></td></tr></table></figure>
<h2 id="六、参考链接"><a href="#六、参考链接" class="headerlink" title="六、参考链接"></a>六、参考链接</h2><ul>
<li>1、<a href="https://github.com/apache/airflow/tree/2.1.4" target="_blank" rel="external">https://github.com/apache/airflow/tree/2.1.4</a></li>
<li>2、<a href="https://airflow.apache.org/docs/helm-chart/1.2.0/index.html" target="_blank" rel="external">https://airflow.apache.org/docs/helm-chart/1.2.0/index.html</a></li>
</ul>
Ingress Nginx传递用户真实IP问题
https://www.yp14.cn/2021/10/30/Ingress-Nginx传递用户真实IP问题/
2021-10-30T05:38:11.000Z
2021-10-30T05:38:49.754Z
<h2 id="背景"><a href="#背景" class="headerlink" title="背景"></a>背景</h2><p>业务应用经常有需要用到用户真实ip的场景,比如:异地登录的风险预警、访问用户分布统计等功能。当有这种需求时,在业务容器化的过程中,如果用到ingress就要注意配置了。通常,用户ip的传递依靠的是<code>X-Forwarded-*</code>参数。但是默认情况下,ingress是没有开启的。</p>
<p>ingress的文档 <a href="https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration" target="_blank" rel="external">https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration</a> 还比较详细,这里介绍一下用到的3个参数: </p>
<p><img src="https://cdm.yp14.cn/img1/20200608132630839.jpeg" alt=""></p>
<blockquote>
<p>注:在文档顶栏的搜索框搜索forward字样就可以找到这3个参数</p>
</blockquote>
<a id="more"></a>
<h2 id="1-use-forwarded-headers"><a href="#1-use-forwarded-headers" class="headerlink" title="1. use-forwarded-headers"></a>1. use-forwarded-headers</h2><ul>
<li>如果Nginx在其他7层代理或负载均衡后面,当期望Nginx将<code>X-Forwarded-*</code>的头信息传递给后端服务时,则需要将此参数设为true</li>
<li>如果设为false(默认为false),Nginx会忽略掉<code>X-Forwarded-*</code>的头信息。false设置适用于Nginx直接对外或前面只有3层负载均衡的场景</li>
</ul>
<p>由于ingress的主配置是从configmap中获取的,更新参数则需要修改名为nginx-configuration的configmap的配置:在data配置块下添加<code>use-forwarded-headers: "true"</code></p>
<p>修改后,ingress nginx会自动加载更新nginx.conf主配置文件。下图为更新前后配置文件变化对比:</p>
<p><img src="https://cdm.yp14.cn/img1/20200608133126879.jpeg" alt=""></p>
<blockquote>
<p>注:左边为开启use-forwarded-headers后ingress nginx主配置文件,右边为开启前</p>
</blockquote>
<h2 id="2-forwarded-for-header"><a href="#2-forwarded-for-header" class="headerlink" title="2. forwarded-for-header"></a>2. forwarded-for-header</h2><p>用来设置识别客户端来源真实ip的字段,默认是<code>X-Forwarded-For</code>。如果想修改为自定义的字段名,则可以在configmap的data配置块下添加:<code>forwarded-for-header: "THE_NAME_YOU_WANT"</code>。通常情况下,我们使用默认的字段名就满足需求,所以不用对这个字段进行额外配置。</p>
<h2 id="3-compute-full-forwarded-for"><a href="#3-compute-full-forwarded-for" class="headerlink" title="3. compute-full-forwarded-for"></a>3. compute-full-forwarded-for</h2><p>如果只是开启了<code>use-forwarded-headers: "true"</code>的话,会发现还是没能获取到客户端来源的真实ip,原因是当前X-Forwarded-For变量是从remote_addr获取的值,每次取到的都是最近一层代理的ip。为了解决这个问题,就要配置compute-full-forwarded-for字段了,即在configmap的data配置块添加:<code>compute-full-forwarded-for: "true"</code>。其作用就是,将客户端用户访问所经过的代理ip按逗号连接的列表形式记录下来。</p>
<p>待ingress nginx加载configmap并更新主配置文件后,对比更新前后变化如下:</p>
<p><img src="https://cdm.yp14.cn/img1/20200608133445261.jpeg" alt=""></p>
<p><img src="https://cdm.yp14.cn/img1/20200608133514483.jpeg" alt=""></p>
<blockquote>
<p>注:左边是未开启compute-full-forwarded-for配置的ingress nginx主配置文件,右边是开启了的</p>
</blockquote>
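把上面三个参数汇总起来,修改后的 configmap 大致如下(configmap 名称 nginx-configuration 与命名空间 ingress-nginx 均以实际部署为准):

```shell
# 汇总本文用到的三个参数,生成 ingress-nginx ConfigMap 示意
# (名称 nginx-configuration、命名空间 ingress-nginx 以实际部署为准)
cat > nginx-configuration.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
data:
  use-forwarded-headers: "true"        # 信任并传递 X-Forwarded-* 头
  compute-full-forwarded-for: "true"   # 记录完整代理链,而非只取最近一层
  # forwarded-for-header: "X-Forwarded-For"  # 默认值,通常无需修改
EOF
# kubectl apply -f nginx-configuration.yaml
```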
<h2 id="举例说明"><a href="#举例说明" class="headerlink" title="举例说明"></a>举例说明</h2><p>如果从客户端ip0发起一个HTTP请求到达服务器之前,经过了三个代理proxy1、proxy2、proxy3,对应的ip分别为ip1、ip2、ip3,那么服务端最后得到的X-Forwarded-For值为:ip0,ip1,ip2。列表中并没有ip3,ip3可以在服务端通过remote_addr来获得。这样应用程序通过获取X-Forwarded-For字段的第一个ip,就可以得到客户端用户真实ip了。</p>
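应用侧取第一个 ip 的逻辑很简单,用 shell 演示如下(ip0、ip1、ip2 为上文示例中的占位地址):

```shell
# 从 X-Forwarded-For 中取第一个地址,即客户端真实 IP
xff="ip0, ip1, ip2"          # 服务端收到的 X-Forwarded-For 值
client_ip=${xff%%,*}         # 截掉第一个逗号及其后的内容
echo "$client_ip"            # 输出 ip0
```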
<h2 id="注意项"><a href="#注意项" class="headerlink" title="注意项"></a>注意项</h2><p>值得注意的是,并不是所有的场景都能通过X-Forwarded-For来获取用户真实ip。<br>比如,当服务器前端使用了CDN的时候,X-Forwarded-For方式获取到的可能就是CDN的来源ip了,<br>这种情况,可以跟CDN厂商约定一个字段名来记录用户真实ip,然后代理将这个字段逐层传递,最后到服务端。</p>
<blockquote>
<ul>
<li>作者:felix_yujing</li>
<li>原文链接:<a href="https://blog.csdn.net/felix_yujing/article/details/106616962" target="_blank" rel="external">https://blog.csdn.net/felix_yujing/article/details/106616962</a></li>
</ul>
</blockquote>
Kubectl 高亮输出
https://www.yp14.cn/2021/10/13/Kubectl-高亮输出/
2021-10-13T11:58:28.000Z
2021-10-13T12:01:15.417Z
<h2 id="kubecolor-带有色彩输出"><a href="#kubecolor-带有色彩输出" class="headerlink" title="kubecolor 带有色彩输出"></a>kubecolor 带有色彩输出</h2><ul>
<li>获取 kubernetes node 节点信息</li>
</ul>
<p><img src="https://cdm.yp14.cn/img1/95733375-04929680-0cbd-11eb-82f3-adbcfecf4a3e.png" alt=""></p>
<ul>
<li>显示 kubernetes pods 详细信息</li>
</ul>
<p><img src="https://cdm.yp14.cn/img1/95733389-08beb400-0cbd-11eb-983b-cf5138277fe3.png" alt=""></p>
<a id="more"></a>
<ul>
<li>更换背景颜色主题</li>
</ul>
<p><img src="https://cdm.yp14.cn/img1/95733403-0c523b00-0cbd-11eb-9ff9-abc5469e97ca.png" alt=""></p>
<p>从上面来看,带有色彩输出比没有带色彩输出看的更舒服些。</p>
<h2 id="Kubecolor-如何运行?"><a href="#Kubecolor-如何运行?" class="headerlink" title="Kubecolor 如何运行?"></a>Kubecolor 如何运行?</h2><p>kubecolor 为 kubectl 命令输出着色,不执行任何其他操作。kubecolor 在内部调用 <code>kubectl command</code> 并尝试对输出进行着色。</p>
<h2 id="Kubecolor-安装"><a href="#Kubecolor-安装" class="headerlink" title="Kubecolor 安装"></a>Kubecolor 安装</h2><ul>
<li>二进制文件安装</li>
</ul>
<p>打开 <code>https://github.com/dty1er/kubecolor/releases</code> 页面,下载相应的二进制文件,下载文件后,把文件放到 <code>/usr/local/bin</code> 目录下,并把文件添加执行权限。</p>
<ul>
<li>Mac 安装</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ brew install dty1er/tap/kubecolor</span><br></pre></td></tr></table></figure>
<h2 id="Kubecolor-用法"><a href="#Kubecolor-用法" class="headerlink" title="Kubecolor 用法"></a>Kubecolor 用法</h2><p>如果习惯使用 kubectl,可以把 kubecolor 命令做一个 kubectl 别名。具体在 .bash_profile 文件中配置,下面是具体配置。kubecolor 使用和 kubectl 命令方法一样。</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">alias</span> kubectl=<span class="string">"kubecolor"</span></span><br></pre></td></tr></table></figure>
<p>当 kubecolor 输出 tty 不是标准输出时,它会自动禁用着色。例如,如果您正在运行 <code>kubecolor get pods > result.txt</code> 或 <code>kubecolor get pods | grep xxx</code>,则输出将传递到文件或其它命令,因此不会着色。在这种情况下,您可以通过传递 <code>--force-colors</code> 标志来强制 kubecolor 进行着色。</p>
<blockquote>
<p>项目地址:<a href="https://github.com/dty1er/kubecolor" target="_blank" rel="external">https://github.com/dty1er/kubecolor</a></p>
</blockquote>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul>
<li><a href="https://github.com/dty1er/kubecolor" target="_blank" rel="external">https://github.com/dty1er/kubecolor</a></li>
</ul>
聊聊TPS、QPS、CPS概念和区别.md
https://www.yp14.cn/2021/07/29/聊聊TPS、QPS、CPS概念和区别-md/
2021-07-29T03:46:01.000Z
2021-07-29T03:46:31.315Z
<h2 id="TPS-概念"><a href="#TPS-概念" class="headerlink" title="TPS 概念"></a>TPS 概念</h2><p><code>TPS</code>:是<code>TransactionsPerSecond</code>的缩写,也就是事务数/秒。它是软件测试结果的测量单位。一个事务是指一个客户机向服务器发送请求然后服务器做出反应的过程。客户机在发送请求时开始计时,收到服务器响应后结束计时,以此来计算使用的时间和完成的事务个数。</p>
<h2 id="QPS-概念"><a href="#QPS-概念" class="headerlink" title="QPS 概念"></a>QPS 概念</h2><p><code>QPS</code>:<code>Queries Per Second</code>意思是<code>每秒查询率</code>,是一台服务器每秒能够响应的查询次数,是对一个特定的查询服务器在规定时间内所处理流量多少的衡量标准。</p>
<h2 id="CPS-概念"><a href="#CPS-概念" class="headerlink" title="CPS 概念"></a>CPS 概念</h2><p><code>CPS</code>:<code>Connection Per Second</code>意思是<code>每秒新建连接数</code>,定义了新建连接的速率。当新建连接的速率超过规格定义的每秒新建连接数时,新建连接请求将被丢弃。</p>
<a id="more"></a>
<h2 id="TPS-与-QPS-区别"><a href="#TPS-与-QPS-区别" class="headerlink" title="TPS 与 QPS 区别"></a>TPS 与 QPS 区别</h2><p>TPS 即每秒处理事务数,包括以下部分:</p>
<ul>
<li>1、用户请求服务器</li>
<li>2、服务器自己的内部处理</li>
<li>3、服务器返回给用户</li>
</ul>
<p>这三个过程,每秒能够完成N个这三个过程,TPS也就是N。</p>
<p><code>QPS</code> 基本类似于TPS,但是不同的是,对于一个页面的一次访问,形成一个TPS。但一次页面请求,可能产生多次对服务器的请求,服务器对这些请求,就可计入<code>QPS</code>之中。</p>
<p>例如:访问一个页面会请求服务器3次,一次访问,产生一个<code>“T”</code>,产生3个<code>“Q”</code>。</p>
<h2 id="QPS-计算公式"><a href="#QPS-计算公式" class="headerlink" title="QPS 计算公式"></a>QPS 计算公式</h2><p>每秒查询率QPS是对一个特定的查询服务器在规定时间内所处理流量多少的衡量标准,在因特网上,作为域名系统服务器的机器的性能经常用每秒查询率来衡量。</p>
<ul>
<li><code>原理</code>:每天80%的访问集中在20%的时间里,这20%时间叫做峰值时间</li>
<li><code>公式</code>:( 总PV数 * 80% ) / ( 每天秒数 * 20% ) = 峰值时间每秒请求数(QPS)</li>
<li><code>机器</code>:峰值时间每秒QPS / 单台机器的QPS = 需要的机器数</li>
</ul>
<p>问:每天300w PV 的在单台机器上,这台机器需要多少QPS?</p>
<p>答:( 3000000 * 0.8 ) / ( 86400 * 0.2 ) ≈ 139 (QPS)</p>
<p>问:如果一台机器的QPS是58,需要几台机器来支持?</p>
<p>答:139 / 58 ≈ 2.4,向上取整需要 3 台。</p>
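上面两步估算可以写成一段简单的脚本(300 万 PV、单机 58 QPS 均沿用文中的示例数字):

```shell
# 按 "80% 流量集中在 20% 时间" 估算峰值 QPS 与所需机器数
pv=3000000            # 每日总 PV
per_machine_qps=58    # 单台机器可承受的 QPS

# 峰值 QPS = (PV * 80%) / (86400 * 20%),四舍五入
peak_qps=$(awk -v pv="$pv" 'BEGIN { printf "%.0f", (pv*0.8)/(86400*0.2) }')

# 机器数向上取整
machines=$(( (peak_qps + per_machine_qps - 1) / per_machine_qps ))

echo "峰值QPS=$peak_qps 需要机器=$machines"
```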
<h2 id="系统吞吐量"><a href="#系统吞吐量" class="headerlink" title="系统吞吐量"></a>系统吞吐量</h2><p>一个系统的吞吐量(承压能力)与request对<code>CPU的消耗</code>、<code>外部接口</code>、<code>IO</code>等等紧密关联。单个request对CPU消耗越高,外部系统接口、IO影响速度越慢,系统吞吐能力越低,反之越高。</p>
<p>系统吞吐量几个重要参数:<code>QPS(TPS)</code>、<code>并发数</code>、<code>响应时间</code></p>
<ul>
<li>QPS(TPS):每秒钟request/事务 数量</li>
<li>并发数:系统同时处理的request/事务数</li>
<li>响应时间:一般取平均响应时间</li>
</ul>
<p>理解了上面三个要素的意义之后,就能推算出它们之间的关系:</p>
<p>QPS(TPS)= 并发数/平均响应时间 或者 并发数 = QPS*平均响应时间</p>
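按这个关系做个快速验算(并发数 100、平均响应时间 0.05 秒均为假设值):

```shell
# QPS = 并发数 / 平均响应时间
concurrency=100
avg_rt=0.05   # 平均响应时间,单位:秒
qps=$(awk -v c="$concurrency" -v t="$avg_rt" 'BEGIN { printf "%.0f", c/t }')
echo "$qps"   # 输出 2000
```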
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul>
<li><a href="https://blog.csdn.net/u010889616/article/details/83245695" target="_blank" rel="external">https://blog.csdn.net/u010889616/article/details/83245695</a></li>
<li><a href="https://blog.csdn.net/yanyuan_smartisan/article/details/112871685" target="_blank" rel="external">https://blog.csdn.net/yanyuan_smartisan/article/details/112871685</a></li>
</ul>
K8S Configmap和Secret热更新之Reloader
https://www.yp14.cn/2021/07/24/K8S-Configmap和Secret热更新之Reloader/
2021-07-24T04:16:01.000Z
2021-07-24T04:16:38.057Z
<h1 id="一-背景"><a href="#一-背景" class="headerlink" title="一 背景"></a>一 背景</h1><h2 id="1-1-配置中心问题"><a href="#1-1-配置中心问题" class="headerlink" title="1.1 配置中心问题"></a>1.1 配置中心问题</h2><p>在云原生中配置中心,例如:<code>Configmap</code>和<code>Secret</code>对象,虽然可以进行直接更新资源对象</p>
<ul>
<li>对于引用这些有些不变的配置是可以打包到镜像中的,那可变的配置呢?</li>
<li>信息泄漏,很容易引发安全风险,尤其是一些敏感信息,比如密码、密钥等。</li>
<li>每次配置更新后,都要重新打包一次,升级应用。镜像版本过多,也给镜像管理和镜像中心存储带来很大的负担。</li>
<li>定制化太严重,可扩展能力差,且不容易复用。</li>
</ul>
<h2 id="1-2-使用方式"><a href="#1-2-使用方式" class="headerlink" title="1.2 使用方式"></a>1.2 使用方式</h2><p><code>Configmap</code>或<code>Secret</code>使用有两种方式,一种是<code>env</code>系统变量赋值,一种是<code>volume</code>挂载赋值,env写入系统的configmap是不会热更新的,而volume写入的方式支持热更新!</p>
<ul>
<li>对于env环境的,必须要滚动更新pod才能生效,也就是删除老的pod,重新使用镜像拉起新pod加载环境变量才能生效。</li>
<li>对于volume的方式,虽然内容变了,但是需要我们的应用直接监控configmap的变动,或者一直去更新环境变量才能在这种情况下达到热更新的目的。</li>
</ul>
<p>应用不支持热更新,可以在业务 Pod 中启动一个 sidecar 容器,监控configmap的变动,更新配置文件,或者滚动更新pod达到更新配置的效果。</p>
<a id="more"></a>
<h1 id="二-解决方案"><a href="#二-解决方案" class="headerlink" title="二 解决方案"></a>二 解决方案</h1><p>ConfigMap 和 Secret 是 Kubernetes 常用的保存配置数据的对象,你可以根据需要选择合适的对象存储数据。通过 Volume 方式挂载到 Pod 内的,kubelet 都会定期进行更新。但是通过环境变量注入到容器中,这样无法感知到 ConfigMap 或 Secret 的内容更新。</p>
<p>目前如何让 Pod 内的业务感知到 ConfigMap 或 Secret 的变化,还是一个待解决的问题。但是我们还是有一些 Workaround 的。</p>
<p>如果业务自身支持 reload 配置的话,比如nginx -s reload,可以通过 inotify 感知到文件更新,或者直接定期进行 reload(这里可以配合我们的 readinessProbe 一起使用)。</p>
<p>如果我们的业务没有这个能力,考虑到不可变基础设施的思想,我们是不是可以采用滚动升级的方式进行?没错,这是一个非常好的方法。目前有个开源工具Reloader,它就是采用这种方式,通过 watch ConfigMap 和 Secret,一旦发现对象更新,就自动触发对 Deployment 或 StatefulSet 等工作负载对象进行滚动升级。</p>
<h1 id="三-reloader简介"><a href="#三-reloader简介" class="headerlink" title="3 About Reloader"></a>3 About Reloader</h1><h2 id="3-1-reloader简介"><a href="#3-1-reloader简介" class="headerlink" title="3.1 What Reloader does"></a>3.1 What Reloader does</h2><p><code>Reloader</code> watches for changes in ConfigMaps and Secrets and performs rolling upgrades on the Pods of the associated DeploymentConfigs, Deployments, DaemonSets, and StatefulSets.</p>
<h2 id="3-2-reloader安装"><a href="#3-2-reloader安装" class="headerlink" title="3.2 Installing Reloader"></a>3.2 Installing Reloader</h2><ul>
<li>Helm</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ helm repo add stakater https://stakater.github.io/stakater-charts</span><br><span class="line">$ helm repo update</span><br><span class="line">$ helm install reloader stakater/reloader</span><br></pre></td></tr></table></figure>
<ul>
<li>Kustomize</li>
</ul>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl apply -k https://github.com/stakater/Reloader/deployments/kubernetes</span><br></pre></td></tr></table></figure>
<ul>
<li>Plain manifests</li>
</ul>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl apply -f https://raw.githubusercontent.com/stakater/Reloader/master/deployments/kubernetes/reloader.yaml</span><br><span class="line"></span><br><span class="line">clusterrole.rbac.authorization.k8s.io/reloader-reloader-role created</span><br><span class="line">clusterrolebinding.rbac.authorization.k8s.io/reloader-reloader-role-binding created</span><br><span class="line">deployment.apps/reloader-reloader created</span><br><span class="line">serviceaccount/reloader-reloader created</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">NAME READY STATUS RESTARTS AGE</span><br><span class="line">pod/reloader-reloader<span class="bullet">-66</span>d46d5885-nx64t <span class="number">1</span>/<span class="number">1</span> Running <span class="number">0</span> <span class="number">15</span>s</span><br><span class="line"></span><br><span class="line">NAME READY UP-TO-DATE AVAILABLE AGE</span><br><span class="line">deployment.apps/reloader-reloader <span class="number">1</span>/<span class="number">1</span> <span class="number">1</span> <span class="number">1</span> <span class="number">16</span>s</span><br><span class="line"></span><br><span class="line">NAME DESIRED CURRENT READY AGE</span><br><span class="line">replicaset.apps/reloader-reloader<span class="bullet">-66</span>d46d5885 <span class="number">1</span> 
<span class="number">1</span> <span class="number">1</span> <span class="number">16</span>s</span><br></pre></td></tr></table></figure>
<p><code>Reloader</code> can be configured to ignore ConfigMap or Secret resources via <code>spec.template.spec.containers.args</code> in the Reloader Deployment. If you ignore both, you may as well scale the Deployment to 0, or not deploy Reloader at all.</p>
<table>
<thead>
<tr>
<th>Args</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>--resources-to-ignore=configMaps</td>
<td>To ignore configMaps</td>
</tr>
<tr>
<td>--resources-to-ignore=secrets</td>
<td>To ignore secrets</td>
</tr>
</tbody>
</table>
<h2 id="3-3-配置"><a href="#3-3-配置" class="headerlink" title="3.3 Configuration"></a>3.3 Configuration</h2><h3 id="3-3-1-自动更新"><a href="#3-3-1-自动更新" class="headerlink" title="3.3.1 Automatic reload"></a>3.3.1 Automatic reload</h3><p><code>reloader.stakater.com/search</code> and <code>reloader.stakater.com/auto</code> do not work together. If a workload carries the annotation reloader.stakater.com/auto: "true", any change to a configmap or secret it references restarts it, whether or not those objects carry reloader.stakater.com/match: "true".</p>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">kind:</span> Deployment</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> reloader.stakater.com/auto: <span class="string">"true"</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> template:</span> metadata:</span><br></pre></td></tr></table></figure>
<h3 id="3-3-2-制定更新"><a href="#3-3-2-制定更新" class="headerlink" title="3.3.2 Targeted reload"></a>3.3.2 Targeted reload</h3><p>You can also target a specific configmap or secret, so that a rolling upgrade is triggered only when that particular object changes, rather than for every configmap or secret used by the deployment, daemonset, or statefulset.</p>
<p>With this mode, a deployment is rolled only when a referenced configmap or secret carries <code>reloader.stakater.com/match: "true"</code>; if the annotation is "false" or absent, changes to that object never restart the workload.</p>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">kind:</span> Deployment</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> reloader.stakater.com/search: <span class="string">"true"</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> template:</span></span><br></pre></td></tr></table></figure>
<p>ConfigMap side:</p>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">kind:</span> ConfigMap</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> reloader.stakater.com/match: <span class="string">"true"</span></span><br><span class="line"><span class="attr">data:</span></span><br><span class="line"><span class="attr"> key:</span> value</span><br></pre></td></tr></table></figure>
<h3 id="3-3-3-指定cm"><a href="#3-3-3-指定cm" class="headerlink" title="3.3.3 Naming a specific cm"></a>3.3.3 Naming a specific cm</h3><p>When a deployment mounts several configmaps and you only want an update to one particular configmap to roll the deployment (updates to the others should be ignored), you can name that configmap, or a list of them, directly on the deployment.</p>
<p>For example, if a deploy mounts the two configmaps nginx-cm1 and nginx-cm2 and should roll only when nginx-cm1 is updated, no annotation is needed on either configmap; just add <code>configmap.reloader.stakater.com/reload: "nginx-cm1"</code> to the deploy, and an update to nginx-cm1 alone triggers the rolling upgrade.</p>
<p>Separate multiple configmap names with commas:</p>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># configmap对象</span></span><br><span class="line"><span class="attr">kind:</span> Deployment</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> configmap.reloader.stakater.com/reload: <span class="string">"nginx-cm1"</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> template:</span> metadata:</span><br></pre></td></tr></table></figure>
<figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># secret对象</span></span><br><span class="line"><span class="attr">kind:</span> Deployment</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> secret.reloader.stakater.com/reload: <span class="string">"foo-secret"</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> template:</span> metadata:</span><br></pre></td></tr></table></figure>
<blockquote>
<p>No annotation is needed on the cm or secret itself; only the resource object that references it needs the annotation.</p>
</blockquote>
<h1 id="四-测试"><a href="#四-测试" class="headerlink" title="四 测试"></a>四 测试</h1><h2 id="4-1-deploy"><a href="#4-1-deploy" class="headerlink" title="4.1 deploy"></a>4.1 deploy</h2><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> apps/v1</span><br><span class="line"><span class="attr">kind:</span> Deployment</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> annotations:</span> </span><br><span class="line"> reloader.stakater.com/search: <span class="string">"true"</span></span><br><span class="line"><span class="attr"> labels:</span></span><br><span class="line"><span class="attr"> run:</span> nginx</span><br><span class="line"><span class="attr"> name:</span> nginx</span><br><span class="line"><span class="attr"> namespace:</span> 
default</span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line"><span class="attr"> replicas:</span> <span class="number">1</span></span><br><span class="line"><span class="attr"> selector:</span></span><br><span class="line"><span class="attr"> matchLabels:</span></span><br><span class="line"><span class="attr"> run:</span> nginx</span><br><span class="line"><span class="attr"> template:</span></span><br><span class="line"><span class="attr"> metadata:</span></span><br><span class="line"><span class="attr"> labels:</span></span><br><span class="line"><span class="attr"> run:</span> nginx</span><br><span class="line"><span class="attr"> spec:</span></span><br><span class="line"><span class="attr"> containers:</span></span><br><span class="line"><span class="attr"> - image:</span> nginx</span><br><span class="line"><span class="attr"> name:</span> nginx</span><br><span class="line"><span class="attr"> volumeMounts:</span> </span><br><span class="line"><span class="attr"> - name:</span> nginx-cm</span><br><span class="line"><span class="attr"> mountPath:</span> /data/cfg</span><br><span class="line"><span class="attr"> readOnly:</span> <span class="literal">true</span></span><br><span class="line"><span class="attr"> volumes:</span> </span><br><span class="line"><span class="attr"> - name:</span> nginx-cm</span><br><span class="line"><span class="attr"> configMap:</span> </span><br><span class="line"><span class="attr"> name:</span> nginx-cm</span><br><span class="line"><span class="attr"> items:</span> </span><br><span class="line"><span class="attr"> - key:</span> config.yaml </span><br><span class="line"><span class="attr"> path:</span> config.yaml</span><br><span class="line"><span class="attr"> mode:</span> <span class="number">0644</span></span><br></pre></td></tr></table></figure>
<h2 id="4-2-configmap"><a href="#4-2-configmap" class="headerlink" title="4.2 configmap"></a>4.2 configmap</h2><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> v1</span><br><span class="line"><span class="attr">data:</span></span><br><span class="line"> config.yaml: <span class="string">|</span><br><span class="line"> # project settings</span><br><span class="line"></span><span class="attr"> DEFAULT_CONF:</span></span><br><span class="line"><span class="attr"> port:</span> <span class="number">8888</span> </span><br><span class="line"><span class="attr"> UNITTEST_TENCENT_ZONE:</span> ap-chongqing<span class="bullet">-1</span></span><br><span class="line"><span class="attr">kind:</span> ConfigMap</span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line"><span class="attr"> name:</span> nginx-cm</span><br><span class="line"><span class="attr"> annotations:</span></span><br><span class="line"> reloader.stakater.com/match: <span class="string">"true"</span></span><br></pre></td></tr></table></figure>
<h2 id="4-3-测试"><a href="#4-3-测试" class="headerlink" title="4.3 Test"></a>4.3 Test</h2><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl get pod</span><br><span class="line">NAME READY STATUS RESTARTS AGE</span><br><span class="line">nginx-68c9bf4ff7-9gmg6 1/1 Running 0 10m</span><br><span class="line"></span><br><span class="line">$ kubectl get cm</span><br><span class="line">NAME DATA AGE</span><br><span class="line">nginx-cm 1 28m</span><br><span class="line"></span><br><span class="line"># update the cm content</span><br><span class="line"></span><br><span class="line">$ kubectl edit cm nginx-cm </span><br><span class="line"></span><br><span class="line">configmap/nginx-cm edited</span><br><span class="line"></span><br><span class="line"># the pod rolls over and reloads the config file</span><br><span class="line">$ kubectl get pod</span><br><span class="line"></span><br><span class="line">NAME READY STATUS RESTARTS AGE</span><br><span class="line">nginx-66c758b548-9dllm 0/1 ContainerCreating 0 4s</span><br><span class="line">nginx-68c9bf4ff7-9gmg6 1/1 Running 0 10m</span><br></pre></td></tr></table></figure>
<h1 id="五-Reloader-使用注意事项"><a href="#五-Reloader-使用注意事项" class="headerlink" title="5 Reloader usage notes"></a>5 Reloader usage notes</h1><ul>
<li><p>Reloader is a cluster-wide resource; it is best deployed once in a shared-services namespace, after which every other namespace can use its features.</p>
</li>
<li><p>If the secret.reloader.stakater.com/reload or configmap.reloader.stakater.com/reload annotation is used on a deploymentconfig/deployment/daemonset/statefulset, Pods are reloaded only when the named configmap or secret changes.</p>
</li>
<li><p>reloader.stakater.com/search and reloader.stakater.com/auto cannot be used together. If a workload carries the annotation reloader.stakater.com/auto: "true", it always restarts when a referenced configmap or secret is modified, whether or not those objects carry reloader.stakater.com/match: "true".</p>
</li>
</ul>
<h1 id="六-反思"><a href="#六-反思" class="headerlink" title="6 Takeaways"></a>6 Takeaways</h1><p>Reloader watches ConfigMaps and Secrets and, whenever one is updated, automatically triggers a rolling upgrade of the Deployment, StatefulSet, or other workload that references it.</p>
<p>If the application does not watch its configuration files itself, this is a very convenient way to achieve hot configuration updates.</p>
<h1 id="参考链接"><a href="#参考链接" class="headerlink" title="References"></a>References</h1><ul>
<li><a href="https://github.com/stakater/Reloader" target="_blank" rel="external">https://github.com/stakater/Reloader</a></li>
</ul>
<blockquote>
<ul>
<li>Author: kaliarch</li>
<li>Original: <a href="https://juejin.cn/post/6897882769624727559" target="_blank" rel="external">https://juejin.cn/post/6897882769624727559</a></li>
</ul>
</blockquote>
Does importing a mysqldump backup into Alibaba Cloud RDS fail?
https://www.yp14.cn/2021/06/20/Mysqldump导入备份数据到阿里云RDS会报错吗/
2021-06-20T09:55:37.000Z
2021-06-20T09:56:08.047Z
<h2 id="前言"><a href="#前言" class="headerlink" title="Preface"></a>Preface</h2><p>Small datasets are usually backed up with the mysqldump command. I recently backed up my blog's data from an Alibaba Cloud RDS instance, and importing that dump back into the RDS instance failed with <code>[Err] 1227 - Access denied; you need (at least one of) the SUPER privilege(s) for this operation</code>.</p>
<blockquote>
<p>PS: the RDS instance runs MySQL <code>5.6</code>.</p>
</blockquote>
<p>The error is puzzling at first: the account used is the privileged one, so why is the operation denied?</p>
<a id="more"></a>
<h2 id="错误原因"><a href="#错误原因" class="headerlink" title="Cause"></a>Cause</h2><p>Searching the Alibaba Cloud documentation for this error turned up the answer; the specific fix follows.</p>
<ul>
<li>When importing into an RDS MySQL instance: the dump contains SQL statements that require the SUPER privilege, which RDS MySQL does not grant, so those statements must be removed.</li>
<li>The source MySQL instance has GTID enabled, so the dump contains a GTID_PURGED statement.</li>
</ul>
<h2 id="解决方法"><a href="#解决方法" class="headerlink" title="Solution"></a>Solution</h2><p>1. Remove the DEFINER clauses.</p>
<p>Check the SQL file and remove clauses like:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">DEFINER=`root`@`%`</span><br></pre></td></tr></table></figure>
<p>On Linux, you can try stripping them with:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ sed -e 's/DEFINER[ ]*=[ ]*[^*]*\*/\*/ ' your.sql > your_revised.sql</span><br></pre></td></tr></table></figure>
<p>2. Remove the GTID_PURGED statement.</p>
<p>Check the SQL file and remove statements like the following:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">SET @@GLOBAL.GTID_PURGED='d0502171-3e23-11e4-9d65-d89d672af420:1-373,</span><br><span class="line">d5deee4e-3e23-11e4-9d65-d89d672a9530:1-616234';</span><br></pre></td></tr></table></figure>
<p>On Linux, you can strip it with:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ awk '{ if (index($0,"GTID_PURGED")) { getline; while (length($0) > 0) { getline; } } else { print $0 } }' your.sql | grep -iv 'set @@' > your_revised.sql</span><br></pre></td></tr></table></figure>
<p>3. Verify the edited file.</p>
<p>After editing, check that the file meets the requirements with:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ egrep -in "definer|set @@" your_revised.sql</span><br></pre></td></tr></table></figure>
<blockquote>
<p>If this command produces no output, the SQL file is ready to import.</p>
</blockquote>
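<p>The same clean-up can also be done in one pass with a short script. This is a hedged Python equivalent of the sed/awk commands above, not an official tool; check the result against your own dump before importing:</p>

```python
import re

def clean_dump(sql: str) -> str:
    """Strip statements that RDS rejects for lack of SUPER privilege."""
    # 1) drop DEFINER=`user`@`host` clauses
    sql = re.sub(r"DEFINER\s*=\s*`[^`]*`@`[^`]*`", "", sql)
    # 2) drop the (possibly multi-line) SET @@GLOBAL.GTID_PURGED statement
    sql = re.sub(r"SET @@GLOBAL\.GTID_PURGED\s*=\s*'[^']*';", "", sql)
    return sql

dump = """CREATE DEFINER=`root`@`%` PROCEDURE p() BEGIN END;
SET @@GLOBAL.GTID_PURGED='d0502171-3e23-11e4-9d65-d89d672af420:1-373,
d5deee4e-3e23-11e4-9d65-d89d672a9530:1-616234';
CREATE TABLE t (id INT);"""

cleaned = clean_dump(dump)
# same check as the egrep above: no match means the file is ready
print(re.search(r"definer|set @@", cleaned, re.IGNORECASE))  # None
```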
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="References"></a>References</h2><ul>
<li>Alibaba Cloud documentation: <a href="https://developer.aliyun.com/article/66463" target="_blank" rel="external">https://developer.aliyun.com/article/66463</a></li>
</ul>
Reaching Pods inside a K8S cluster from your local network for debugging
https://www.yp14.cn/2021/06/06/K8S集群内Pod如何与本地网络打通实现debug/
2021-06-05T22:41:41.000Z
2021-06-05T22:42:33.502Z
<h2 id="前言"><a href="#前言" class="headerlink" title="Preface"></a>Preface</h2><p>Before K8S, we could connect straight to services in the test environment to debug them. With K8S that is no longer possible: a Pod's IP belongs to the cluster-internal network, which is unreachable from outside the cluster. Is there a good way to connect to a Pod directly? Enter the open-source tool <code>Telepresence</code>.</p>
<h2 id="Telepresence-简介"><a href="#Telepresence-简介" class="headerlink" title="About Telepresence"></a>About Telepresence</h2><p>Telepresence is an open-source tool that lets you run a single service locally while connecting that service to a remote Kubernetes cluster. Developers of multi-service applications can then:</p>
<ul>
<li>Iterate quickly on one service locally, even when it depends on other services in the cluster. Change the service, save, and immediately see the new code running.</li>
<li>Test, debug, and edit the service with any locally installed tooling, for example a debugger or an IDE!</li>
<li>Make the local development machine behave as if it were part of the Kubernetes cluster. If an application on your machine needs to run against services in the cluster, that is easy to do.</li>
</ul>
<blockquote>
<p>Project repository: <a href="https://github.com/telepresenceio/telepresence" target="_blank" rel="external">https://github.com/telepresenceio/telepresence</a></p>
</blockquote>
<a id="more"></a>
<h2 id="Telepresence-如何运行"><a href="#Telepresence-如何运行" class="headerlink" title="How Telepresence works"></a>How Telepresence works</h2><p>Telepresence deploys a two-way network proxy in a pod running in the Kubernetes cluster. That pod proxies data from the Kubernetes environment (e.g. TCP connections, environment variables, volumes) to the local process. The local process's networking is transparently overridden so that DNS lookups and TCP connections are routed through the proxy to the remote Kubernetes cluster.</p>
<p>This approach gives you:</p>
<ul>
<li>full access from your local service to the other services in the remote cluster</li>
<li>full access from your local service to the Kubernetes <code>environment</code> variables, <code>secrets</code>, and <code>ConfigMap</code>s</li>
<li>full access from remote services to your local service</li>
</ul>
<h2 id="Telepresence-支持的运行平台"><a href="#Telepresence-支持的运行平台" class="headerlink" title="Supported platforms"></a>Supported platforms</h2><ul>
<li>Mac OS X</li>
<li>Linux</li>
</ul>
<h2 id="Telepresence-安装"><a href="#Telepresence-安装" class="headerlink" title="Installing Telepresence"></a>Installing Telepresence</h2><p>Install with Homebrew, apt, or dnf.</p>
<h2 id="Telepresence-使用报告"><a href="#Telepresence-使用报告" class="headerlink" title="Usage reporting"></a>Usage reporting</h2><p>Telepresence collects some basic information about its users so that it can send important customer notices, such as new-version availability and security bulletins. The information is also aggregated into anonymous basic usage analytics. To disable this behavior, set the environment variable <code>SCOUT_DISABLE</code>:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">export</span> SCOUT_DISABLE=1</span><br></pre></td></tr></table></figure>
<h2 id="Telepresence-使用方法"><a href="#Telepresence-使用方法" class="headerlink" title="Using Telepresence"></a>Using Telepresence</h2><p>Not covered here; see <a href="https://www.telepresence.io/tutorials/kubernetes" target="_blank" rel="external">https://www.telepresence.io/tutorials/kubernetes</a></p>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="References"></a>References</h2><ul>
<li><a href="https://github.com/telepresenceio/telepresence" target="_blank" rel="external">https://github.com/telepresenceio/telepresence</a></li>
<li><a href="https://www.telepresence.io/discussion/overview" target="_blank" rel="external">https://www.telepresence.io/discussion/overview</a></li>
</ul>
Building highly available multi-instance Harbor on shared storage
https://www.yp14.cn/2021/05/16/Harbor多实例高可用共享存储搭建/
2021-05-16T11:14:17.000Z
2021-05-16T11:17:04.672Z
<h2 id="多实例共享存储架构图"><a href="#多实例共享存储架构图" class="headerlink" title="Multi-instance shared-storage architecture"></a>Multi-instance shared-storage architecture</h2><p><img src="https://cdm.yp14.cn/img1/0806e2fb37bf7b39ac53f83018a4f47e.png" alt=""></p>
<p>The LB in this article is Alibaba Cloud SLB rather than Nginx.</p>
<p>For <code>more Harbor architectures</code>, see <a href="https://www.yp14.cn/2021/05/09/%E8%81%8A%E8%81%8AHarbor%E6%9E%B6%E6%9E%84/">聊聊Harbor架构</a></p>
<h2 id="本文架构需要考虑三个问题"><a href="#本文架构需要考虑三个问题" class="headerlink" title="Three questions this architecture must answer"></a>Three questions this architecture must answer</h2><ul>
<li>1. Choice of shared storage. Harbor's backend storage currently supports <code>AWS S3</code>, <code>Openstack Swift</code>, <code>Ceph</code>, and so on. This article uses <code>Alibaba Cloud Extreme NAS</code>, whose disk IO outperforms a single local disk; it is mounted with NFS v3.</li>
<li>2. Sessions cannot be shared across instances, so Harbor's Redis must be deployed separately and all instances must connect to the same Redis.</li>
<li>3. The same goes for the database: deploy a single standalone database and point all instances at it.</li>
</ul>
<blockquote>
<p>Note: if you use Alibaba Cloud NAS in production, use <code>Extreme NAS</code>; <code>General-purpose NAS</code> is not recommended.</p>
</blockquote>
<p>Alibaba Cloud NAS performance reference: <a href="https://help.aliyun.com/document_detail/124577.html?spm=a2c4g.11186623.6.552.2eb05ea0HJUgUB" target="_blank" rel="external">https://help.aliyun.com/document_detail/124577.html?spm=a2c4g.11186623.6.552.2eb05ea0HJUgUB</a></p>
<a id="more"></a>
<h2 id="部署资源"><a href="#部署资源" class="headerlink" title="Machines"></a>Machines</h2><table>
<thead>
<tr>
<th>Hostname</th>
<th>IP address</th>
<th>OS</th>
</tr>
</thead>
<tbody>
<tr>
<td>harbor1</td>
<td>192.168.10.10</td>
<td>CentOS 7.9</td>
</tr>
<tr>
<td>harbor2</td>
<td>192.168.10.11</td>
<td>CentOS 7.9</td>
</tr>
</tbody>
</table>
<h2 id="部署"><a href="#部署" class="headerlink" title="Deployment"></a>Deployment</h2><p>Harbor is installed with the <code>online installer</code> via docker-compose; setting up Docker and docker-compose is not covered here, as plenty of guides can be found online.</p>
<h3 id="1、挂载阿里云极速性NAS"><a href="#1、挂载阿里云极速性NAS" class="headerlink" title="1. Mount the Alibaba Cloud Extreme NAS"></a>1. Mount the Alibaba Cloud Extreme NAS</h3><blockquote>
<p>Mount the NAS on both the harbor1 and harbor2 machines.</p>
</blockquote>
<p>To mount automatically at boot, open /etc/fstab and add the mount entry.</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># create the NAS mount point</span></span><br><span class="line">$ mkdir /data</span><br><span class="line"></span><br><span class="line"><span class="comment"># raise the number of concurrent NFS requests</span></span><br><span class="line">$ sudo <span class="built_in">echo</span> <span class="string">"options sunrpc tcp_slot_table_entries=128"</span> >> /etc/modprobe.d/sunrpc.conf </span><br><span class="line">$ sudo <span class="built_in">echo</span> <span class="string">"options sunrpc tcp_max_slot_table_entries=128"</span> >> /etc/modprobe.d/sunrpc.conf</span><br></pre></td></tr></table></figure>
<ul>
<li>To mount an NFS v4 file system, add:</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">file-system-id.region.nas.aliyuncs.com:/ /data nfs vers=4,minorversion=0,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev,noresvport 0 0</span><br></pre></td></tr></table></figure>
<ul>
<li>To mount an NFS v3 file system, add:</li>
</ul>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">file-system-id.region.nas.aliyuncs.com:/ /data nfs vers=3,nolock,proto=tcp,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev,noresvport 0 0</span><br></pre></td></tr></table></figure>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># after adding the entry to /etc/fstab, mount everything</span></span><br><span class="line">$ mount <span class="_">-a</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># verify: if the NFS mount address appears in the output, the mount succeeded</span></span><br><span class="line">$ df -h | grep aliyun</span><br></pre></td></tr></table></figure>
<h3 id="2、临时部署单机-Harbor"><a href="#2、临时部署单机-Harbor" class="headerlink" title="2. Bring up a temporary single-node Harbor"></a>2. Bring up a temporary single-node Harbor</h3><p>On the harbor1 machine:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 在线部署Harbor</span></span><br><span class="line">$ <span class="built_in">cd</span> /opt/</span><br><span class="line">$ wget https://github.com/goharbor/harbor/releases/download/v2.2.1/harbor-online-installer-v2.2.1.tgz</span><br><span class="line">$ tar xf 
harbor-online-installer-v2.2.1.tgz</span><br><span class="line">$ <span class="built_in">cd</span> /opt/harbor</span><br><span class="line">$ cp harbor.yml.tmpl harbor.yml</span><br><span class="line"></span><br><span class="line"><span class="comment"># 创建harbor数据存储</span></span><br><span class="line">$ mkdir /data/harbor</span><br><span class="line"></span><br><span class="line"><span class="comment"># 添加域名证书,已有域名SSL证书</span></span><br><span class="line">$ mkdir /data/harbor/cert</span><br><span class="line"></span><br><span class="line"><span class="comment"># 把SSL证书公钥和私钥上传到 /data/harbor/cert 目录中</span></span><br><span class="line">$ scp harbor.example.pem root@192.168.10.10:/data/harbor/cert/</span><br><span class="line">$ scp harbor.example.key root@192.168.10.10:/data/harbor/cert/</span><br><span class="line"></span><br><span class="line"><span class="comment"># 配置 harbor.yml 文件,下面是修改后文件与原文件比较结果</span></span><br><span class="line">$ diff harbor.yml harbor.yml.tmpl</span><br><span class="line"></span><br><span class="line">5c5</span><br><span class="line">< hostname: harbor.example.com</span><br><span class="line">---</span><br><span class="line">> hostname: reg.mydomain.com</span><br><span class="line">17,18c17,18</span><br><span class="line">< certificate: /data/harbor/cert/harbor.example.pem</span><br><span class="line">< private_key: /data/harbor/cert/harbor.example.key</span><br><span class="line">---</span><br><span class="line">> certificate: /your/certificate/path</span><br><span class="line">> private_key: /your/private/key/path</span><br><span class="line">29c29</span><br><span class="line">< external_url: https://harbor.example.com</span><br><span class="line">---</span><br><span class="line">> <span class="comment"># external_url: https://reg.mydomain.com:8433</span></span><br><span class="line"></span><br><span class="line">< data_volume: /data/harbor</span><br><span class="line">---</span><br><span class="line">> data_volume: 
/data</span><br><span class="line"></span><br><span class="line"><span class="comment"># 生成配置文件</span></span><br><span class="line">$ <span class="built_in">cd</span> /opt/harbor</span><br><span class="line"></span><br><span class="line"><span class="comment"># harbor开启helm charts 和 镜像漏洞扫描</span></span><br><span class="line">$ ./prepare --with-notary --with-trivy --with-chartmuseum</span><br><span class="line"></span><br><span class="line"><span class="comment"># 安装</span></span><br><span class="line">$ ./install.sh --with-notary --with-trivy --with-chartmuseum</span><br><span class="line"></span><br><span class="line"><span class="comment"># 查看</span></span><br><span class="line">$ docker-compose ps</span><br></pre></td></tr></table></figure>
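<p>The <code>harbor.yml</code> edits shown in the diff above can also be scripted. A minimal sketch, run against a tiny stand-in file rather than the real template so it is self-contained (the hostname, certificate paths, and data directory are this article's example values):</p>

```shell
#!/usr/bin/env bash
# Sketch: apply the harbor.yml changes from the diff above with sed.
# A small stand-in for harbor.yml.tmpl is created first; point sed at
# the real /opt/harbor/harbor.yml in practice.
set -euo pipefail

cat > harbor.yml <<'EOF'
hostname: reg.mydomain.com
  certificate: /your/certificate/path
  private_key: /your/private/key/path
# external_url: https://reg.mydomain.com:8433
data_volume: /data
EOF

sed -i \
  -e 's|^hostname: .*|hostname: harbor.example.com|' \
  -e 's|certificate: .*|certificate: /data/harbor/cert/harbor.example.pem|' \
  -e 's|private_key: .*|private_key: /data/harbor/cert/harbor.example.key|' \
  -e 's|^# external_url: .*|external_url: https://harbor.example.com|' \
  -e 's|^data_volume: .*|data_volume: /data/harbor|' \
  harbor.yml

grep -q 'hostname: harbor.example.com' harbor.yml && echo "harbor.yml updated"
```

Scripting the edit makes the change repeatable on the second node; always diff against the shipped template after upgrading Harbor, since new keys may appear.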
<h3 id="3、单独部署Harbor数据库和Redis"><a href="#3、单独部署Harbor数据库和Redis" class="headerlink" title="3、单独部署Harbor数据库和Redis"></a>3. Deploy the Harbor database and Redis separately</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Create the postgres and redis data directories</span></span><br><span class="line">$ mkdir -p /data/harbor-redis /data/harbor-postgresql</span><br><span class="line"></span><br><span class="line"><span class="comment"># Change ownership (the containers run as uid/gid 999)</span></span><br><span class="line">$ chown -R 999:999 /data/harbor-redis /data/harbor-postgresql</span><br></pre></td></tr></table></figure>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 创建 postgres 和 redis docker-compose.yml 文件</span></span><br><span class="line">$ vim docker-compose.yml</span><br><span class="line"></span><br><span class="line">version: <span class="string">'2.3'</span></span><br><span class="line"></span><br><span class="line">services:</span><br><span 
class="line"> redis:</span><br><span class="line"> image: goharbor/redis-photon:v2.2.1</span><br><span class="line"> container_name: harbor-redis</span><br><span class="line"> restart: always</span><br><span class="line"> <span class="built_in">cap</span>_drop:</span><br><span class="line"> - ALL</span><br><span class="line"> <span class="built_in">cap</span>_add:</span><br><span class="line"> - CHOWN</span><br><span class="line"> - SETGID</span><br><span class="line"> - SETUID</span><br><span class="line"> volumes:</span><br><span class="line"> - /data/harbor-redis:/var/lib/redis</span><br><span class="line"> networks:</span><br><span class="line"> - harbor-db</span><br><span class="line"> ports:</span><br><span class="line"> - 6379:6379</span><br><span class="line"> postgresql:</span><br><span class="line"> image: goharbor/harbor-db:v2.2.1</span><br><span class="line"> container_name: harbor-postgresql</span><br><span class="line"> restart: always</span><br><span class="line"> <span class="built_in">cap</span>_drop:</span><br><span class="line"> - ALL</span><br><span class="line"> <span class="built_in">cap</span>_add:</span><br><span class="line"> - CHOWN</span><br><span class="line"> - DAC_OVERRIDE</span><br><span class="line"> - SETGID</span><br><span class="line"> - SETUID</span><br><span class="line"> environment:</span><br><span class="line"> POSTGRES_USER: postgres</span><br><span class="line"> POSTGRES_PASSWORD: <span class="built_in">test</span>2021</span><br><span class="line"> volumes:</span><br><span class="line"> - /data/harbor-postgresql:/var/lib/postgresql/data:z</span><br><span class="line"> networks:</span><br><span class="line"> - harbor-db</span><br><span class="line"> ports:</span><br><span class="line"> - 5432:5432</span><br><span class="line"></span><br><span class="line">networks:</span><br><span class="line"> harbor-db:</span><br><span class="line"> driver: bridge</span><br><span class="line"></span><br><span class="line"><span 
class="comment"># 部署 postgres 和 redis</span></span><br><span class="line">$ docker-compose up <span class="_">-d</span></span><br></pre></td></tr></table></figure>
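<p>After <code>docker-compose up -d</code>, a quick smoke test of both containers is worthwhile. A hedged sketch that only writes the check commands to a small script for review (the container names come from the compose file above; <code>redis-cli</code> and <code>pg_isready</code> are assumed to be available inside the respective images):</p>

```shell
#!/usr/bin/env bash
# Sketch: generate post-deploy smoke checks for the standalone redis and
# postgres containers. The commands are written to a file (not executed
# here) so they can be reviewed first, then run on the DB host.
set -euo pipefail

cat > smoke-check.sh <<'EOF'
# redis should answer PONG
docker exec harbor-redis redis-cli ping
# postgres should report "accepting connections"
docker exec harbor-postgresql pg_isready -U postgres
EOF
echo "wrote smoke-check.sh"
```
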
<h3 id="4、导入-postgres-数据"><a href="#4、导入-postgres-数据" class="headerlink" title="4、导入 postgres 数据"></a>4. Import the postgres data</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Enter the temporary harbor-db container to export the databases</span></span><br><span class="line">$ docker <span class="built_in">exec</span> -it -u postgres harbor-db bash</span><br><span class="line"></span><br><span class="line"><span class="comment"># Dump the data</span></span><br><span class="line">$ pg_dump -U postgres registry > /tmp/registry.sql</span><br><span class="line">$ pg_dump -U postgres notarysigner > /tmp/notarysigner.sql</span><br><span class="line">$ pg_dump -U postgres notaryserver > /tmp/notaryserver.sql</span><br><span class="line"></span><br><span class="line"><span class="comment"># Import the dumps into the standalone PostgreSQL instance</span></span><br><span class="line">$ psql -h 192.168.10.10 -U postgres registry -W < /tmp/registry.sql</span><br><span class="line">$ psql -h 192.168.10.10 -U postgres notarysigner -W < /tmp/notarysigner.sql</span><br><span class="line">$ psql -h 192.168.10.10 -U postgres notaryserver -W < /tmp/notaryserver.sql</span><br></pre></td></tr></table></figure>
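<p>The three dump/import pairs above follow one pattern, so they can be generated in a loop. A sketch that only writes the commands to a runbook file (192.168.10.10 is the example DB host used throughout this post; review the file, then run the dumps inside the temporary container and the imports from a host with <code>psql</code>):</p>

```shell
#!/usr/bin/env bash
# Sketch: generate the dump/import command pairs for the three Harbor
# databases instead of typing them one by one.
set -euo pipefail

db_host=192.168.10.10          # standalone PostgreSQL host (article example)
: > migrate-db.sh
for db in registry notarysigner notaryserver; do
  echo "pg_dump -U postgres ${db} > /tmp/${db}.sql"               >> migrate-db.sh
  echo "psql -h ${db_host} -U postgres ${db} -W < /tmp/${db}.sql" >> migrate-db.sh
done
cat migrate-db.sh
```
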
<h3 id="5、清理临时部署单机Harbor数据和相关配置文件"><a href="#5、清理临时部署单机Harbor数据和相关配置文件" class="headerlink" title="5、清理临时部署单机Harbor数据和相关配置文件"></a>5、清理临时部署单机Harbor数据和相关配置文件</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span 
class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span 
class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 清理harbor数据和配置文件</span></span><br><span class="line">$ cp <span class="_">-a</span> /data/harbor/cert /tmp/</span><br><span class="line">$ rm -rf /data/harbor/*</span><br><span class="line">$ rm -rf /opt/harbor</span><br><span class="line">$ cp <span class="_">-a</span> /tmp/cert /data/harbor/</span><br><span class="line"></span><br><span class="line"><span class="comment"># 重新创建配置文件</span></span><br><span class="line">$ <span class="built_in">cd</span> /opt/</span><br><span class="line">$ tar xf harbor-online-installer-v2.2.1.tgz</span><br><span class="line">$ <span class="built_in">cd</span> /opt/harbor</span><br><span class="line"></span><br><span class="line"><span class="comment"># 修改配置文件,连接单独部署postgres和redis,注释harbor自带的postgres和redis</span></span><br><span class="line">$ cp harbor.yml.tmpl harbor.yml</span><br><span class="line">$ diff harbor.yml harbor.yml.tmpl</span><br><span class="line"></span><br><span class="line">5c5</span><br><span class="line">< hostname: harbor.example.com</span><br><span class="line">---</span><br><span class="line">> hostname: reg.mydomain.com</span><br><span class="line">17,18c17,18</span><br><span class="line">< certificate: /data/harbor/cert/harbor.example.pem</span><br><span class="line">< private_key: /data/harbor/cert/harbor.example.key</span><br><span
class="line">---</span><br><span class="line">> certificate: /your/certificate/path</span><br><span class="line">> private_key: /your/private/key/path</span><br><span class="line">29c29</span><br><span class="line">< external_url: https://harbor.example.com</span><br><span class="line">---</span><br><span class="line">> <span class="comment"># external_url: https://reg.mydomain.com:8433</span></span><br><span class="line"></span><br><span class="line">37c37</span><br><span class="line">< <span class="comment"># database:</span></span><br><span class="line">---</span><br><span class="line">> database:</span><br><span class="line">39c39</span><br><span class="line">< <span class="comment"># password: root123</span></span><br><span class="line">---</span><br><span class="line">> password: root123</span><br><span class="line">41c41</span><br><span class="line">< <span class="comment"># max_idle_conns: 50</span></span><br><span class="line">---</span><br><span class="line">> max_idle_conns: 50</span><br><span class="line">44c44</span><br><span class="line">< <span class="comment"># max_open_conns: 1000</span></span><br><span class="line">---</span><br><span class="line">> max_open_conns: 1000</span><br><span class="line">47c47</span><br><span class="line"></span><br><span class="line">< data_volume: /data/harbor</span><br><span class="line">---</span><br><span class="line">> data_volume: /data</span><br><span class="line"></span><br><span class="line">135,158c135,158</span><br><span class="line">< external_database:</span><br><span class="line">< harbor:</span><br><span class="line">< host: 192.168.10.10</span><br><span class="line">< port: 5432</span><br><span class="line">< db_name: registry</span><br><span class="line">< username: postgres</span><br><span class="line">< password: <span class="built_in">test</span>2021</span><br><span class="line">< ssl_mode: <span class="built_in">disable</span></span><br><span class="line">< max_idle_conns: 50</span><br><span 
class="line">< max_open_conns: 1000</span><br><span class="line">< notary_signer:</span><br><span class="line">< host: 192.168.10.10</span><br><span class="line">< port: 5432</span><br><span class="line">< db_name: notarysigner</span><br><span class="line">< username: postgres</span><br><span class="line">< password: <span class="built_in">test</span>2021</span><br><span class="line">< ssl_mode: <span class="built_in">disable</span></span><br><span class="line">< notary_server:</span><br><span class="line">< host: 192.168.10.10</span><br><span class="line">< port: 5432</span><br><span class="line">< db_name: notaryserver</span><br><span class="line">< username: postgres</span><br><span class="line">< password: <span class="built_in">test</span>2021</span><br><span class="line">< ssl_mode: <span class="built_in">disable</span></span><br><span class="line">---</span><br><span class="line">> <span class="comment"># external_database:</span></span><br><span class="line">> <span class="comment"># harbor:</span></span><br><span class="line">> <span class="comment"># host: harbor_db_host</span></span><br><span class="line">> <span class="comment"># port: harbor_db_port</span></span><br><span class="line">> <span class="comment"># db_name: harbor_db_name</span></span><br><span class="line">> <span class="comment"># username: harbor_db_username</span></span><br><span class="line">> <span class="comment"># password: harbor_db_password</span></span><br><span class="line">> <span class="comment"># ssl_mode: disable</span></span><br><span class="line">> <span class="comment"># max_idle_conns: 2</span></span><br><span class="line">> <span class="comment"># max_open_conns: 0</span></span><br><span class="line">> <span class="comment"># notary_signer:</span></span><br><span class="line">> <span class="comment"># host: notary_signer_db_host</span></span><br><span class="line">> <span class="comment"># port: notary_signer_db_port</span></span><br><span class="line">> <span 
class="comment"># db_name: notary_signer_db_name</span></span><br><span class="line">> <span class="comment"># username: notary_signer_db_username</span></span><br><span class="line">> <span class="comment"># password: notary_signer_db_password</span></span><br><span class="line">> <span class="comment"># ssl_mode: disable</span></span><br><span class="line">> <span class="comment"># notary_server:</span></span><br><span class="line">> <span class="comment"># host: notary_server_db_host</span></span><br><span class="line">> <span class="comment"># port: notary_server_db_port</span></span><br><span class="line">> <span class="comment"># db_name: notary_server_db_name</span></span><br><span class="line">> <span class="comment"># username: notary_server_db_username</span></span><br><span class="line">> <span class="comment"># password: notary_server_db_password</span></span><br><span class="line">> <span class="comment"># ssl_mode: disable</span></span><br><span class="line">161,175c161,175</span><br><span class="line">< external_redis:</span><br><span class="line">< <span class="comment"># support redis, redis+sentinel</span></span><br><span class="line">< <span class="comment"># host for redis: <host_redis>:<port_redis></span></span><br><span class="line">< <span class="comment"># host for redis+sentinel:</span></span><br><span class="line">< <span class="comment"># <host_sentinel1>:<port_sentinel1>,<host_sentinel2>:<port_sentinel2>,<host_sentinel3>:<port_sentinel3></span></span><br><span class="line">< host: 192.168.10.10:6379</span><br><span class="line">< password:</span><br><span class="line">< <span class="comment"># sentinel_master_set must be set to support redis+sentinel</span></span><br><span class="line">< <span class="comment">#sentinel_master_set:</span></span><br><span class="line">< <span class="comment"># db_index 0 is for core, it's unchangeable</span></span><br><span class="line">< registry_db_index: 1</span><br><span class="line">< 
jobservice_db_index: 2</span><br><span class="line">< chartmuseum_db_index: 3</span><br><span class="line">< trivy_db_index: 5</span><br><span class="line">< idle_timeout_seconds: 30</span><br><span class="line">---</span><br><span class="line">> <span class="comment"># external_redis:</span></span><br><span class="line">> <span class="comment"># # support redis, redis+sentinel</span></span><br><span class="line">> <span class="comment"># # host for redis: <host_redis>:<port_redis></span></span><br><span class="line">> <span class="comment"># # host for redis+sentinel:</span></span><br><span class="line">> <span class="comment"># # <host_sentinel1>:<port_sentinel1>,<host_sentinel2>:<port_sentinel2>,<host_sentinel3>:<port_sentinel3></span></span><br><span class="line">> <span class="comment"># host: redis:6379</span></span><br><span class="line">> <span class="comment"># password:</span></span><br><span class="line">> <span class="comment"># # sentinel_master_set must be set to support redis+sentinel</span></span><br><span class="line">> <span class="comment"># #sentinel_master_set:</span></span><br><span class="line">> <span class="comment"># # db_index 0 is for core, it's unchangeable</span></span><br><span class="line">> <span class="comment"># registry_db_index: 1</span></span><br><span class="line">> <span class="comment"># jobservice_db_index: 2</span></span><br><span class="line">> <span class="comment"># chartmuseum_db_index: 3</span></span><br><span class="line">> <span class="comment"># trivy_db_index: 5</span></span><br><span class="line">> <span class="comment"># idle_timeout_seconds: 30</span></span><br></pre></td></tr></table></figure>
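<p>A common slip at this step is leaving <code>external_database</code> or <code>external_redis</code> commented out, in which case <code>./prepare</code> keeps the built-in services. A small sanity check, shown against a stand-in <code>harbor.yml</code> containing only the keys the check inspects:</p>

```shell
#!/usr/bin/env bash
# Sketch: verify the external sections really were uncommented before
# running ./prepare. The stand-in harbor.yml keeps the example
# self-contained; run the same greps against the real file.
set -euo pipefail

cat > harbor.yml <<'EOF'
external_database:
  harbor:
    host: 192.168.10.10
external_redis:
  host: 192.168.10.10:6379
EOF

for key in external_database external_redis; do
  grep -qE "^${key}:" harbor.yml && echo "${key}: enabled"
done | tee check.log
```
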
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 部署第一个节点 harbor</span></span><br><span class="line">$ <span class="built_in">cd</span> /opt/harbor</span><br><span class="line"></span><br><span class="line"><span class="comment"># harbor开启helm charts 和 镜像漏洞扫描</span></span><br><span class="line">$ ./prepare --with-notary --with-trivy --with-chartmuseum</span><br><span class="line"></span><br><span class="line"><span class="comment"># 安装</span></span><br><span class="line">$ ./install.sh --with-notary --with-trivy --with-chartmuseum</span><br><span class="line"></span><br><span class="line"><span class="comment"># 查看</span></span><br><span class="line">$ docker-compose ps</span><br><span class="line"></span><br><span class="line"><span class="comment"># 拷贝配置到 harbor2 机器上</span></span><br><span class="line">$ scp -r /opt/harbor 192.168.10.11:/opt/</span><br></pre></td></tr></table></figure>
<p>Run the following on the harbor2 machine.</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 部署第二个节点 harbor</span></span><br><span class="line">$ <span class="built_in">cd</span> /opt/harbor</span><br><span class="line"></span><br><span class="line"><span class="comment"># harbor开启helm charts 和 镜像漏洞扫描</span></span><br><span class="line">$ ./prepare --with-notary --with-trivy --with-chartmuseum</span><br><span class="line"></span><br><span class="line"><span class="comment"># 安装</span></span><br><span class="line">$ ./install.sh --with-notary --with-trivy --with-chartmuseum</span><br><span class="line"></span><br><span class="line"><span class="comment"># 查看</span></span><br><span class="line">$ docker-compose ps</span><br></pre></td></tr></table></figure>
<h3 id="6、配置阿里云SLB"><a href="#6、配置阿里云SLB" class="headerlink" title="6、配置阿里云SLB"></a>6. Configure Alibaba Cloud SLB</h3><p>SLB configuration is not covered in detail here; see the Alibaba Cloud SLB documentation below. Configure a listener on port 443 using the TCP protocol, with the backend mapped to port 443 on both the harbor1 and harbor2 machines.</p>
<p>For the <code>SLB configuration steps</code>, see the Alibaba Cloud documentation: <a href="https://help.aliyun.com/document_detail/205495.html?spm=a2c4g.11174283.6.666.f9aa1192jngFKC" target="_blank" rel="external">https://help.aliyun.com/document_detail/205495.html?spm=a2c4g.11174283.6.666.f9aa1192jngFKC</a></p>
<h2 id="多实例共享存储架构图"><a href="#多实例共享存储架构图" class="headerlink" title="多实例共享存储架构图"></a>Multi-instance shared-storage architecture diagram</h2><p><img src="https://cdm.yp14.cn/img1/0806e2fb37bf7b39ac53f83018a4f47e.png" alt=""></p>
<p>This article uses Alibaba Cloud SLB as the load balancer rather than Nginx.</p>
<p>For <code>more Harbor architectures</code>, see <a href="https://www.yp14.cn/2021/05/09/%E8%81%8A%E8%81%8AHarbor%E6%9E%B6%E6%9E%84/">聊聊Harbor架构</a> (an overview of Harbor architectures).</p>
<h2 id="本文架构需要考虑三个问题"><a href="#本文架构需要考虑三个问题" class="headerlink" title="本文架构需要考虑三个问题"></a>Three issues this architecture must address</h2><ul>
<li>1. Choice of shared storage: Harbor's backend storage currently supports <code>AWS S3</code>, <code>OpenStack Swift</code>, <code>Ceph</code>, and so on. This article uses <code>Alibaba Cloud Extreme NAS</code> (极速型NAS), whose I/O performance is better than that of a single disk, mounted over NFS v3.</li>
<li>2. Sessions are not shared across instances by default, so the Harbor Redis must be deployed separately, with every instance connecting to the same Redis.</li>
<li>3. The Harbor multi-instance database: a single database must be deployed separately, with every instance connecting to the same database.</li>
</ul>
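<p>For the shared-storage item, the NFS v3 mount on each Harbor node can be sketched as follows. The NAS endpoint below is a hypothetical placeholder, and the command is only printed for review rather than executed, since mounting requires root and a reachable NAS:</p>

```shell
#!/usr/bin/env bash
# Sketch: NFS v3 mount of the shared NAS volume on a Harbor node.
# The endpoint is made up for illustration, not a real NAS address;
# noresvport is the option Alibaba Cloud recommends for NAS mounts.
set -euo pipefail

nas_endpoint="example.cn-hangzhou.nas.aliyuncs.com"   # hypothetical
echo "mount -t nfs -o vers=3,nolock,proto=tcp,noresvport ${nas_endpoint}:/ /data/harbor" \
  | tee mount-nas.sh
```
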
<blockquote>
<p>Note: if you use Alibaba Cloud NAS in production, <code>Extreme NAS</code> (极速型) is recommended; <code>General-purpose NAS</code> is not.</p>
</blockquote>
<p>Alibaba Cloud NAS performance reference: <a href="https://help.aliyun.com/document_detail/124577.html?spm=a2c4g.11186623.6.552.2eb05ea0HJUgUB">https://help.aliyun.com/document_detail/124577.html?spm=a2c4g.11186623.6.552.2eb05ea0HJUgUB</a></p>
聊聊Harbor架构
https://www.yp14.cn/2021/05/09/聊聊Harbor架构/
2021-05-09T04:59:44.000Z
2021-05-09T05:00:33.225Z
<h2 id="Harbor-简介"><a href="#Harbor-简介" class="headerlink" title="Harbor 简介"></a>Harbor overview</h2><p><code>Harbor</code> is an enterprise-class Registry server for storing and distributing Docker images.</p>
<p>As an enterprise-class private Registry server, Harbor offers better performance and security, improving the efficiency of transferring images between build and runtime environments. Harbor supports replicating image resources across multiple Registry nodes, and all images stay in the private Registry, keeping data and intellectual property under control within the company network. Harbor also provides advanced <code>security features</code> such as <code>user management</code>, <code>access control</code>, and <code>activity auditing</code>.</p>
<ul>
<li><code>Role-based access control</code>: users and Docker image repositories are organized into "projects"; a user can have different permissions on multiple repositories within the same namespace (project).</li>
<li><code>Image replication</code>: images can be replicated (synchronized) across multiple Registry instances, well suited to load-balanced, highly available, hybrid-cloud, and multi-cloud scenarios.</li>
<li><code>Graphical user interface</code>: browse and search the Docker image repository and manage projects and namespaces from a browser.</li>
<li><code>AD/LDAP support</code>: Harbor integrates with an enterprise's existing AD/LDAP for authentication and authorization.</li>
<li><code>Audit management</code>: all operations against the image repository are recorded and traceable for auditing.</li>
<li><code>Internationalization</code>: localized in English, Chinese, German, Japanese, and Russian, with more languages to come.</li>
<li><code>RESTful API</code>: gives administrators fuller control of Harbor and makes integration with other management software easier.</li>
<li><code>Simple deployment</code>: online and offline installers are provided, and Harbor can also be installed on vSphere as a virtual appliance (OVA).</li>
</ul>
<a id="more"></a>
<h2 id="Harbor-架构"><a href="#Harbor-架构" class="headerlink" title="Harbor 架构"></a>Harbor architecture</h2><h3 id="1、主从同步架构"><a href="#1、主从同步架构" class="headerlink" title="1、主从同步架构"></a>1. Primary-replica replication</h3><p>By default, Harbor officially offers a <code>primary-replica replication</code> scheme to solve image synchronization: through replication, images in a test-environment Harbor can be synchronized to the production Harbor in near real time, along these lines:</p>
<p><img src="https://cdm.yp14.cn/img1/0d0d1f5e0f7d261d7d4969f30538d794.png" alt=""></p>
<p>In real production operations, images often need to be published to dozens or hundreds of cluster nodes. A single Registry can no longer satisfy the download demand of that many nodes, so multiple Registry instances must be configured behind a load balancer. Maintaining images across multiple Registry instances by hand would be extremely tedious. Harbor supports a <code>one-primary, many-replica publishing model</code> that solves large-scale image distribution:</p>
<p><img src="https://cdm.yp14.cn/img1/43d336bb62c35dc91cc35b88109a9148.png" alt=""></p>
<p>Publish an image to one Harbor and it <code>fans out</code> to the other Registries automatically, efficiently and reliably.</p>
<p>For geographically distributed clusters, a hierarchical publishing model also works: for example, sync from the headquarters data center to branch office 1, then from branch office 1 to branch office 2:</p>
<p><img src="https://cdm.yp14.cn/img1/8ccf5b9fb07c5f6857bce77c0af73686.png" alt=""></p>
<p>However, replication alone still does not remove the single point of failure at the primary Harbor node. Read on for the next architecture.</p>
<h3 id="2、双主复制说明"><a href="#2、双主复制说明" class="headerlink" title="2、双主复制说明"></a>2. Dual-primary replication</h3><p><code>Dual-primary replication</code> reuses replication to set up <code>bidirectional synchronization</code> between two Harbor nodes for data consistency, with a load balancer in front of them distributing incoming requests across the instances. As soon as one instance receives a new image, it is automatically replicated to the other. This provides load balancing, avoids a single point of failure, and achieves a degree of high availability for Harbor:</p>
<p><img src="https://cdm.yp14.cn/img1/fc3cca4971781548f39573a2aed60b98.png" alt=""></p>
<p>This scheme has one problem: the two Harbor instances can end up with inconsistent data. Suppose instance A goes down while a new image is pushed; the image then exists only on instance B. Even after A recovers, B will not resynchronize the image automatically. You must manually disable and then re-enable B's replication policy to bring the two instances back in sync.</p>
<p>One more gripe: <code>in real production use, replication is quite unreliable!</code> That is why the architecture below is recommended instead.</p>
<h3 id="3、多实例共享后端存储"><a href="#3、多实例共享后端存储" class="headerlink" title="3、多实例共享后端存储"></a>3. Multiple instances sharing backend storage</h3><p><code>Shared backend storage</code> is the <code>more standard approach</code>: multiple Harbor instances share the same backend storage, so any image persisted by one instance can be read by the others. Requests coming through the front-end LB are distributed across the instances, providing load balancing and avoiding a single point of failure.</p>
<p><img src="https://cdm.yp14.cn/img1//0806e2fb37bf7b39ac53f83018a4f47e.png" alt=""></p>
<p>If the production cluster eventually grows so large that even load-balanced Harbor instances cannot keep up, the architecture below can be used: downstream Harbor nodes sync images from the primary and then distribute them to the production servers.</p>
<p><img src="https://cdm.yp14.cn/img1/b1b8247eb469b08920f7e4b01a6a619f.png" alt=""></p>
<p>Deploying this scheme in a real production environment requires considering three issues:</p>
<ul>
<li>1. Choice of shared storage: Harbor's backend storage currently supports <code>AWS S3</code>, <code>OpenStack Swift</code>, <code>Ceph</code>, and so on. A follow-up article will show how to deploy this high-availability architecture with <code>Alibaba Cloud Extreme NAS</code> (极速型NAS) as the backend storage.</li>
<li>2. Sharing sessions across instances: this is no longer a real problem, since recent Harbor versions store sessions in Redis by default, so you only need to externalize Redis. Its availability can be guaranteed with Redis Sentinel or Redis Cluster; a single Redis instance also works, just without high availability.</li>
<li>3. The Harbor multi-instance database: simply split the database out of Harbor and deploy it independently, shared by all instances; its availability can be ensured with a standard database high-availability scheme.</li>
</ul>
<h2 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h2><p>This article briefly introduced Harbor's different architectures. The next article will cover how to deploy the <code>shared backend storage</code> architecture, using <code>Alibaba Cloud Extreme NAS</code> as the backend store. Stay tuned.</p>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>References</h2><ul>
<li><a href="http://www.yunweipai.com/39320.html" target="_blank" rel="external">http://www.yunweipai.com/39320.html</a></li>
</ul>
K8S Cluster Autoscaler: Automatic Cluster Scaling
https://www.yp14.cn/2021/04/21/K8S-Cluster-Autoscaler-集群自动伸缩/
2021-04-21T06:00:23.000Z
2021-04-21T06:01:06.461Z
<h2 id="什么是-cluster-autoscaler"><a href="#什么是-cluster-autoscaler" class="headerlink" title="What is cluster-autoscaler"></a>What is cluster-autoscaler</h2><p><code>Cluster Autoscaler</code> (CA) is a standalone program that elastically scales a Kubernetes cluster. A common question when running Kubernetes is: how many nodes should the cluster keep to satisfy application demand? cluster-autoscaler answers it by automatically growing and shrinking the cluster based on the resources requested by the deployed applications.</p>
<blockquote>
<p>Project: <a href="https://github.com/kubernetes/autoscaler" target="_blank" rel="external">https://github.com/kubernetes/autoscaler</a></p>
</blockquote>
<h2 id="Cluster-Autoscaler-什么时候伸缩集群?"><a href="#Cluster-Autoscaler-什么时候伸缩集群?" class="headerlink" title="When does Cluster Autoscaler scale the cluster?"></a>When does Cluster Autoscaler scale the cluster?</h2><p>The cluster is scaled up or down in the following situations:</p>
<ul>
<li><code>Scale up</code>: some Pods cannot be scheduled on any current node due to insufficient resources</li>
<li><code>Scale down</code>: a node's resource utilization is low and every pod on it can be rescheduled onto other nodes</li>
</ul>
<a id="more"></a>
<h2 id="什么时候集群节点不会被-CA-删除?"><a href="#什么时候集群节点不会被-CA-删除?" class="headerlink" title="When will CA not delete a node?"></a>When will CA not delete a node?</h2><ul>
<li>The node has pods restricted by a <code>PodDisruptionBudget</code>.</li>
<li>The node has pods in the <code>kube-system</code> namespace.</li>
<li>The node has pods not created by a controller — that is, not by a Deployment, ReplicaSet, Job, or StatefulSet.</li>
<li>The node has pods that use local storage.</li>
<li>The node's pods would have nowhere to go after eviction, i.e. no other node can schedule them.</li>
<li>The node carries the annotation <code>"cluster-autoscaler.kubernetes.io/scale-down-disabled": "true"</code> (supported in CA 1.0.3 and later).</li>
</ul>
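<p>The first condition can be illustrated with a minimal <code>PodDisruptionBudget</code> (the name and <code>app</code> label below are hypothetical):</p>

```yaml
# Hypothetical sketch: keep at least 2 "web" pods available at all times.
# CA will not drain a node if evicting its pods would violate this budget.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```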
<h2 id="Horizontal-Pod-Autoscaler-如何与-Cluster-Autoscaler-一起使用?"><a href="#Horizontal-Pod-Autoscaler-如何与-Cluster-Autoscaler-一起使用?" class="headerlink" title="How does Horizontal Pod Autoscaler work with Cluster Autoscaler?"></a>How does Horizontal Pod Autoscaler work with Cluster Autoscaler?</h2><p>Horizontal Pod Autoscaler changes the replica count of a Deployment or ReplicaSet based on current CPU load. If load increases, HPA creates new replicas, for which the cluster may or may not have enough room. If there are not enough resources, CA tries to bring up some nodes so that the HPA-created Pods can run. If load decreases, HPA stops some replicas; as a result, some nodes may become underutilized or completely empty, and CA terminates these unneeded nodes.</p>
<h2 id="如何防止节点被CA删除"><a href="#如何防止节点被CA删除" class="headerlink" title="How to prevent CA from deleting a node?"></a>How to prevent CA from deleting a node?</h2><p>Since <code>CA 1.0</code>, nodes can carry the following annotation:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">"cluster-autoscaler.kubernetes.io/scale-down-disabled"</span>: <span class="string">"true"</span></span><br></pre></td></tr></table></figure>
<p>It can be added to (or removed from) a node with <code>kubectl</code>:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ kubectl annotate node <nodename> cluster-autoscaler.kubernetes.io/scale-down-disabled=<span class="literal">true</span></span><br></pre></td></tr></table></figure>
<h2 id="运行Cluster-Autoscaler-最佳实践?"><a href="#运行Cluster-Autoscaler-最佳实践?" class="headerlink" title="Best practices for running Cluster Autoscaler"></a>Best practices for running Cluster Autoscaler</h2><ul>
<li>Do not modify nodes belonging to an autoscaled node group directly. All nodes within the same node group should have the same capacity, labels, and system pods running on them.</li>
<li>Declare resource requests for your Pods.</li>
<li>Use <code>PodDisruptionBudgets</code> to prevent Pods from being deleted abruptly (if needed).</li>
<li>Before specifying min/max settings for a node pool, check that your cloud provider's quota is large enough.</li>
<li>Do not run any additional node-group autoscalers (especially those from your cloud provider).</li>
</ul>
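<p>The second point matters because CA simulates scheduling against Pod <code>requests</code>, not observed usage; containers without requests make that simulation inaccurate. A minimal container resources sketch (the values are illustrative):</p>

```yaml
# Illustrative container resources: CA sizes nodes against these requests
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"
```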
<h2 id="Cluster-Autoscaler-支持那些云厂商"><a href="#Cluster-Autoscaler-支持那些云厂商" class="headerlink" title="Which cloud providers does Cluster Autoscaler support?"></a>Which cloud providers does Cluster Autoscaler support?</h2><ul>
<li><code>GCE</code> <a href="https://kubernetes.io/docs/concepts/cluster-administration/cluster-management/" target="_blank" rel="external">https://kubernetes.io/docs/concepts/cluster-administration/cluster-management/</a></li>
<li><code>GKE</code> <a href="https://cloud.google.com/container-engine/docs/cluster-autoscaler" target="_blank" rel="external">https://cloud.google.com/container-engine/docs/cluster-autoscaler</a></li>
<li><code>AWS</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md</a></li>
<li><code>Azure</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/azure/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/azure/README.md</a></li>
<li><code>Alibaba Cloud</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/alicloud/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/alicloud/README.md</a></li>
<li><code>OpenStack Magnum</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/magnum/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/magnum/README.md</a></li>
<li><code>DigitalOcean</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/digitalocean/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/digitalocean/README.md</a></li>
<li><code>CloudStack</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/cloudstack/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/cloudstack/README.md</a></li>
<li><code>Exoscale</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/exoscale/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/exoscale/README.md</a></li>
<li><code>Packet</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/packet/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/packet/README.md</a></li>
<li><code>OVHcloud</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/ovhcloud/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/ovhcloud/README.md</a></li>
<li><code>Linode</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/linode/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/linode/README.md</a></li>
<li><code>Hetzner</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/hetzner/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/hetzner/README.md</a></li>
<li><code>Cluster API</code> <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/README.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/README.md</a></li>
</ul>
<h2 id="Cluster-Autoscaler-部署-和-更多实践"><a href="#Cluster-Autoscaler-部署-和-更多实践" class="headerlink" title="Cluster Autoscaler deployment and further practice"></a>Cluster Autoscaler deployment and further practice</h2><p>See: <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md" target="_blank" rel="external">https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md</a></p>
<h2 id="参考链接:"><a href="#参考链接:" class="headerlink" title="References"></a>References</h2><ul>
<li><a href="https://github.com/kubernetes/autoscaler" target="_blank" rel="external">https://github.com/kubernetes/autoscaler</a></li>
<li><a href="https://blog.csdn.net/hello2mao/article/details/80418625" target="_blank" rel="external">https://blog.csdn.net/hello2mao/article/details/80418625</a></li>
</ul>
Ten Kubernetes Interview Questions
https://www.yp14.cn/2021/04/11/十道Kubernetes面试题/
2021-04-11T09:42:58.000Z
2021-04-11T09:43:38.167Z
<h3 id="一、什么是Kubernetes集群中的minions?"><a href="#一、什么是Kubernetes集群中的minions?" class="headerlink" title="1. What are minions in a Kubernetes cluster?"></a>1. What are minions in a Kubernetes cluster?</h3><ul>
<li>1. They are components of the master node.</li>
<li>2. They are the worker nodes of the cluster. [Answer]</li>
<li>3. They are monitoring engines widely used in Kubernetes.</li>
<li>4. They are Docker container services.</li>
</ul>
<h3 id="二、Kubernetes集群数据存储在以下哪个位置?"><a href="#二、Kubernetes集群数据存储在以下哪个位置?" class="headerlink" title="2. Where is Kubernetes cluster data stored?"></a>2. Where is Kubernetes cluster data stored?</h3><ul>
<li>1. The kube-apiserver</li>
<li>2. Kubelet</li>
<li>3. ETCD [Answer]</li>
<li>4. None of the above</li>
</ul>
<a id="more"></a>
<h3 id="三、哪个是Kubernetes控制器?"><a href="#三、哪个是Kubernetes控制器?" class="headerlink" title="3. Which of these is a Kubernetes controller?"></a>3. Which of these is a Kubernetes controller?</h3><ul>
<li>1. ReplicaSet</li>
<li>2. Deployment</li>
<li>3. Rolling Updates</li>
<li>4. ReplicaSet and Deployment [Answer]</li>
</ul>
<h3 id="四、以下哪个是核心Kubernetes对象?"><a href="#四、以下哪个是核心Kubernetes对象?" class="headerlink" title="4. Which of the following is a core Kubernetes object?"></a>4. Which of the following is a core Kubernetes object?</h3><ul>
<li>1. Pods</li>
<li>2. Services</li>
<li>3. Volumes</li>
<li>4. All of the above [Answer]</li>
</ul>
<h3 id="五、Kubernetes-Network代理在哪个节点上运行?"><a href="#五、Kubernetes-Network代理在哪个节点上运行?" class="headerlink" title="5. On which nodes does the Kubernetes network proxy run?"></a>5. On which nodes does the Kubernetes network proxy run?</h3><ul>
<li>1. The master node</li>
<li>2. Worker nodes</li>
<li>3. All nodes [Answer]</li>
<li>4. None of the above</li>
</ul>
<h3 id="六、节点控制器的职责是什么?"><a href="#六、节点控制器的职责是什么?" class="headerlink" title="6. What are the responsibilities of the node controller?"></a>6. What are the responsibilities of the node controller?</h3><ul>
<li>1. Assigning CIDR blocks to nodes</li>
<li>2. Maintaining the list of nodes</li>
<li>3. Monitoring node health</li>
<li>4. All of the above [Answer]</li>
</ul>
<h3 id="七、Replication-Controller的职责是什么?"><a href="#七、Replication-Controller的职责是什么?" class="headerlink" title="7. What are the responsibilities of the Replication Controller?"></a>7. What are the responsibilities of the Replication Controller?</h3><ul>
<li>1. Updating or deleting multiple pods with a single command</li>
<li>2. Helping reach the desired state</li>
<li>3. Creating a new pod when an existing pod crashes</li>
<li>4. All of the above [Answer]</li>
</ul>
<h3 id="八、如何在没有选择器的情况下定义服务?"><a href="#八、如何在没有选择器的情况下定义服务?" class="headerlink" title="8. How do you define a service without a selector?"></a>8. How do you define a service without a selector?</h3><ul>
<li>1. Specify an external name [Answer]</li>
<li>2. Specify endpoints with an IP address and port</li>
<li>3. Just specify an IP address</li>
<li>4. Specify labels and an API version</li>
</ul>
<h3 id="九、1-8版本的Kubernetes引入了什么?"><a href="#九、1-8版本的Kubernetes引入了什么?" class="headerlink" title="9. What did Kubernetes 1.8 introduce?"></a>9. What did Kubernetes 1.8 introduce?</h3><ul>
<li>1. Taints and Tolerations [Answer]</li>
<li>2. Cluster level Logging</li>
<li>3. Secrets</li>
<li>4. Federated Clusters</li>
</ul>
<h3 id="十、Kubelet-调用的处理检查容器的IP地址是否打开的程序是?"><a href="#十、Kubelet-调用的处理检查容器的IP地址是否打开的程序是?" class="headerlink" title="10. Which handler does the Kubelet invoke to check whether a port on the container's IP address is open?"></a>10. Which handler does the Kubelet invoke to check whether a port on the container's IP address is open?</h3><ul>
<li>1. HTTPGetAction</li>
<li>2. ExecAction</li>
<li>3. TCPSocketAction [Answer]</li>
<li>4. None of the above</li>
</ul>
<blockquote>
<ul>
<li>Author: fiisio</li>
<li>Original: <a href="https://zhuanlan.zhihu.com/p/74560934" target="_blank" rel="external">https://zhuanlan.zhihu.com/p/74560934</a></li>
</ul>
</blockquote>
How to Delete Tens of Thousands of Redis Keys Without Impacting the Business
https://www.yp14.cn/2021/03/21/Redis如何删除数量过万以上Key而不影响业务/
2021-03-21T03:33:30.000Z
2021-03-22T02:48:36.095Z
<h2 id="需求"><a href="#需求" class="headerlink" title="The problem"></a>The problem</h2><p>Sometimes a batch of <code>Redis keys</code> needs to be deleted — perhaps they were never given an expiration, the business requires it, Redis is short on memory, or the key values must be rewritten — and the keys follow a naming pattern that can be matched with an expression.</p>
<h2 id="解决方法一"><a href="#解决方法一" class="headerlink" title="Approach 1"></a>Approach 1</h2><p>A typical web search will point you to the following method: Redis offers a blunt instrument, the <code>keys</code> command, which lists every <code>key</code> matching a given pattern.</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ redis-cli --raw keys <span class="string">"testkey-*"</span> | xargs redis-cli del</span><br></pre></td></tr></table></figure>
<p>Use Redis keys to match the keys you need to delete, then hand the result via xargs to redis-cli del. It looks perfect, but carries substantial risk.</p>
<a id="more"></a>
<p>The command is very easy to use — just supply a pattern — but it has two glaring drawbacks.</p>
<ul>
<li>There are no offset/limit parameters: every matching key is returned at once. If millions of keys in the instance match, you will be watching an endless screenful of strings scroll past.</li>
<li>keys is a full traversal with O(n) complexity. With tens of millions of keys in the instance, this command stalls the Redis server: all other reads and writes are delayed or even time out, because Redis below version 6 is a <code>single-threaded program</code> that executes commands sequentially — everything must wait for the keys command to finish. The business becomes unavailable, and there is even a risk of Redis going down.</li>
</ul>
<blockquote>
<p>Note: this method is <code>not recommended</code>; it is advisable to <code>disable the keys command</code> in production. So, is there a better way? Of course — read on.</p>
</blockquote>
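<p>One hedged way to disable <code>keys</code> in production is the <code>rename-command</code> directive in redis.conf (on newer Redis versions, ACLs offer a finer-grained alternative):</p>

```conf
# redis.conf sketch: an empty name disables KEYS entirely;
# alternatively, rename it to an obscure string for emergency use only.
rename-command KEYS ""
# rename-command KEYS "ops-only-keys"
```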
<h2 id="解决方法二"><a href="#解决方法二" class="headerlink" title="Approach 2"></a>Approach 2</h2><p>Redis has supported the <code>scan</code> command since version 2.8. Its basic usage:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">SCAN cursor [MATCH pattern] [COUNT count]</span><br></pre></td></tr></table></figure>
<ul>
<li><code>cursor</code>: SCAN is a cursor-based iterator. Each call returns a new cursor, which must be passed as the cursor argument of the next call to continue the iteration; when the server returns a cursor of 0, one full traversal is complete.</li>
<li><code>MATCH</code>: a match pattern — for example, <code>testkey-*</code> to iterate over all keys beginning with <code>testkey-</code>.</li>
<li><code>COUNT</code>: tells the command roughly how many elements to return per iteration. It is only a hint to the incremental iterator, not the exact number returned — with COUNT set to 2 you might still get 3 elements back — but the number returned correlates with it. The default is 10.</li>
</ul>
<p>Example:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line">$ scan 0 MATCH testkey-*</span><br><span class="line"></span><br><span class="line">1) <span class="string">"34"</span></span><br><span class="line">2) 1) <span class="string">"testkey-2"</span></span><br><span class="line"> 2) <span class="string">"testkey-49"</span></span><br><span class="line"> 3) <span class="string">"testkey-20"</span></span><br><span class="line"> 4) <span class="string">"testkey-19"</span></span><br><span class="line"> 5) <span class="string">"testkey-93"</span></span><br><span class="line"> 6) <span class="string">"testkey-8"</span></span><br><span class="line"> 7) <span class="string">"testkey-34"</span></span><br><span class="line"> 8) <span class="string">"testkey-76"</span></span><br><span class="line"> 9) <span class="string">"testkey-13"</span></span><br><span class="line"> 10) <span 
class="string">"testkey-18"</span></span><br><span class="line"> 11) <span class="string">"testkey-10"</span></span><br><span class="line"></span><br><span class="line">$ scan 34 MATCH testkey-* COUNT 1000</span><br><span class="line"></span><br><span class="line">1) <span class="string">"0"</span></span><br><span class="line">2) 1) <span class="string">"testkey-16"</span></span><br><span class="line"> 2) <span class="string">"testkey-19"</span></span><br><span class="line"> 3) <span class="string">"testkey-23"</span></span><br><span class="line"> 4) <span class="string">"testkey-21"</span></span><br><span class="line"> 5) <span class="string">"testkey-40"</span></span><br><span class="line"> 6) <span class="string">"testkey-22"</span></span><br><span class="line"> 7) <span class="string">"testkey-1"</span></span><br><span class="line"> 8) <span class="string">"testkey-11"</span></span><br><span class="line"> 9) <span class="string">"testkey-28"</span></span><br><span class="line"> 10) <span class="string">"testkey-3"</span></span><br><span class="line"> 11) <span class="string">"testkey-26"</span></span><br><span class="line"> 12) <span class="string">"testkey-4"</span></span><br><span class="line"> 13) <span class="string">"testkey-31"</span></span><br><span class="line"> ...</span><br></pre></td></tr></table></figure>
<p>The scan command returns a two-element array: the first element is the new cursor for the next iteration, and the second is an array containing the iterated elements.</p>
<p>The example above scans all keys prefixed with <code>testkey-</code>. The first iteration uses cursor 0 to start a new traversal, with MATCH filtering keys prefixed testkey-; it returns cursor 34 along with the keys found so far. The second iteration passes the cursor returned by the first call — the first element of its reply, 34 — and sets COUNT to 1000 to push the command to scan more elements in this iteration. That second SCAN call returns cursor 0, meaning the iteration has ended and the whole dataset has been traversed.</p>
<p><code>Redis scan</code> is a <code>cursor-based iterator</code>: every call must pass the cursor returned by the previous call to continue the iteration. A cursor argument of 0 starts a fresh iteration, and a returned cursor of 0 marks the end — this is the only valid termination check; an empty result set does not mean the iteration is finished.</p>
<p>The original requirement can therefore be handled with:</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ redis-cli --scan --pattern <span class="string">"testkey-*"</span> | xargs -L 1000 redis-cli del</span><br></pre></td></tr></table></figure>
<blockquote>
<p><code>xargs -L</code> sets how many lines xargs reads per command invocation — that is, how many keys are deleted per batch. Don't read too many keys in one batch.</p>
</blockquote>
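<p>To see how <code>-L</code> batches a stream — independent of Redis — here is a local sketch: 2500 input lines run through <code>xargs -L 1000</code> produce three command invocations (1000 + 1000 + 500):</p>

```shell
# 2500 input lines, at most 1000 input lines per command invocation:
# xargs runs `echo` three times, so the pipeline prints 3 lines.
seq 1 2500 | xargs -L 1000 echo | wc -l   # -> 3
```

<p>The same batching applies when the command is <code>redis-cli del</code>: each invocation deletes at most 1000 keys, keeping each individual DEL call short.</p>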
<h2 id="scan-与-keys-比较"><a href="#scan-与-keys-比较" class="headerlink" title="scan vs. keys"></a>scan vs. keys</h2><p>Compared with <code>keys</code>, <code>scan</code> has the following characteristics:</p>
<ul>
<li>Its complexity is also O(n), but it proceeds step by step via a cursor and does not block the thread.</li>
<li>It offers a limit parameter to control the maximum number of results per call; the limit is only a hint to the incremental iterator, and the actual result count may be more or fewer.</li>
<li>Like keys, it supports pattern matching.</li>
<li>The server keeps no state for the cursor; the cursor's only state is the integer scan returns to the client.</li>
<li>Results may contain duplicates, and the client must deduplicate them — this point is very important.</li>
<li>If data is modified during the traversal, it is undefined whether the changed data will be visited.</li>
<li>A single empty result does not mean the traversal is over; only a returned cursor of zero does.</li>
</ul>
<h2 id="小结"><a href="#小结" class="headerlink" title="Summary"></a>Summary</h2><p>Redis has a whole family of scan-like commands, for example:</p>
<ul>
<li><code>scan</code> is a family of commands: besides traversing all keys, they can also iterate over specific container collections</li>
<li><code>zscan</code> iterates over the elements of a zset</li>
<li><code>hscan</code> iterates over the fields of a hash</li>
<li><code>sscan</code> iterates over the members of a set</li>
</ul>
<blockquote>
<p>Note: the first argument of the SSCAN, HSCAN, and ZSCAN commands is always a database key, whereas SCAN takes no key argument, because it iterates over all keys in the current database.</p>
</blockquote>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="References"></a>References</h2><ul>
<li><a href="http://jinguoxing.github.io/redis/2018/09/04/redis-scan/" target="_blank" rel="external">http://jinguoxing.github.io/redis/2018/09/04/redis-scan/</a></li>
<li><a href="https://juejin.cn/post/6844903869412016142" target="_blank" rel="external">https://juejin.cn/post/6844903869412016142</a></li>
</ul>
Visual Management of Nginx Configuration
https://www.yp14.cn/2021/03/14/Nginx-配置可视化管理/
2021-03-14T05:15:58.000Z
2021-03-14T05:17:31.832Z
<h2 id="Nginx-配置可视化UI展示"><a href="#Nginx-配置可视化UI展示" class="headerlink" title="Nginx configuration UI"></a>Nginx configuration UI</h2><p><img src="https://cdm.yp14.cn/img1/nginx-login.png" alt="Login"></p>
<p><img src="https://cdm.yp14.cn/img1/nginx-home.png" alt="Home"></p>
<a id="more"></a>
<p><img src="https://cdm.yp14.cn/img1/nginx-upstream.png" alt="Upstream"></p>
<p><img src="https://cdm.yp14.cn/img1/nginx-lisner.png" alt="listen"></p>
<p><img src="https://cdm.yp14.cn/img1/nginx-location.png" alt="Location"></p>
<p><img src="https://cdm.yp14.cn/img1/nginx-conf.png" alt="Conf"></p>
<h2 id="功能"><a href="#功能" class="headerlink" title="Features"></a>Features</h2><ul>
<li>Visual Nginx management</li>
<li>Nginx configuration management</li>
<li>Nginx performance monitoring</li>
</ul>
<h2 id="部署"><a href="#部署" class="headerlink" title="Deployment"></a>Deployment</h2><h3 id="快速部署"><a href="#快速部署" class="headerlink" title="Quick start"></a>Quick start</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">docker run --detach \</span><br><span class="line">--publish 80:80 --publish 8889:8889 \</span><br><span class="line">--name nginx_ui \</span><br><span class="line">--restart always \</span><br><span class="line">crazyleojay/nginx_ui:latest</span><br></pre></td></tr></table></figure>
<h3 id="数据持久化部署"><a href="#数据持久化部署" class="headerlink" title="Deployment with persistent data"></a>Deployment with persistent data</h3><p><code>Config file path</code>: /usr/local/nginx/conf/nginx.conf</p>
<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">docker run --detach \</span><br><span class="line">--publish 80:80 --publish 8889:8889 \</span><br><span class="line">--name nginx_ui \</span><br><span class="line">--restart always \</span><br><span class="line">--volume /home/nginx.conf:/usr/<span class="built_in">local</span>/nginx/conf/nginx.conf \</span><br><span class="line">crazyleojay/nginx_ui:latest</span><br></pre></td></tr></table></figure>
<blockquote>
<p>Project: <a href="https://github.com/onlyGuo/nginx-gui" target="_blank" rel="external">https://github.com/onlyGuo/nginx-gui</a></p>
</blockquote>
<h2 id="小结"><a href="#小结" class="headerlink" title="Summary"></a>Summary</h2><p>This project is suited to a <code>test environment</code> or <code>local development environment</code>, not <code>production</code>. It targets people unfamiliar with Nginx configuration, who can configure it simply through the web UI.</p>
<h2 id="参考链接"><a href="#参考链接" class="headerlink" title="References"></a>References</h2><ul>
<li><a href="https://github.com/onlyGuo/nginx-gui" target="_blank" rel="external">https://github.com/onlyGuo/nginx-gui</a></li>
</ul>
Kubernetes (k8s) Tricks: Log Collection
https://www.yp14.cn/2021/03/04/Kubernetes-k8s-那些套路之日志收集/
2021-03-04T13:49:19.000Z
2021-03-04T13:51:45.578Z
<h2 id="关于容器日志"><a href="#关于容器日志" class="headerlink" title="About container logs"></a>About container logs</h2><p>Docker logs fall into two categories: Docker engine logs and container logs. Engine logs are generally handed to the system logger and live in different places depending on the operating system. This article focuses on container logs, i.e. the output of the application running inside the container. By default, docker logs shows the log of a running container, comprising STDOUT (standard output) and STDERR (standard error). The logs are stored in json-file format at <code>/var/lib/docker/containers/<容器id>/<容器id>-json.log</code>, which is not well suited to production.</p>
<ul>
<li>By default, container logs are not size-limited: a container keeps writing until the disk fills up and affects the system and applications. (The docker log-driver supports log-file rotation.)</li>
<li>The Docker daemon collects the containers' standard output; under heavy log volume the daemon becomes the collection bottleneck, limiting collection speed.</li>
<li>With very large log files, docker logs -f blocks the Docker daemon outright, leaving commands like docker ps unresponsive as well.</li>
</ul>
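<p>The rotation mentioned in the first point can be turned on through the daemon's logging options — a sketch of <code>/etc/docker/daemon.json</code> using the json-file driver's <code>max-size</code> and <code>max-file</code> options:</p>

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
```

<p>This caps each container at three 100 MB JSON log files, but collection still flows through the Docker daemon.</p>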
<p>Docker provides logging-driver configuration, so users can pick a log-driver to suit their needs; see the official <a href="https://docs.docker.com/v17.09/engine/admin/logging/overview/">Configure logging drivers</a> documentation. But those drivers still collect logs through the Docker daemon, so collection speed remains the bottleneck.</p>
<blockquote>
<ul>
<li>log-driver — collection throughput</li>
<li>syslog 14.9 MB/s</li>
<li>json-file 37.9 MB/s</li>
</ul>
</blockquote>
<p>Is there a tool that skips the Docker daemon and redirects log output straight to files with automatic rotation? Yes — use an <code>S6</code> base image.</p>
<p>s6-log redirects the CMD's standard output to /…/default/current instead of sending it to the Docker daemon, avoiding the daemon's log-collection bottleneck. This article builds application images on an <code>S6</code> base image to form a unified log-collection scheme.</p>
<blockquote>
<p>S6:<a href="http://skarnet.org/software/s6/" target="_blank" rel="external">http://skarnet.org/software/s6/</a></p>
</blockquote>
<a id="more"></a>
<h2 id="关于k8s日志"><a href="#关于k8s日志" class="headerlink" title="About k8s logs"></a>About k8s logs</h2><p>Kubernetes log collection can happen at three levels:</p>
<ul>
<li>Application (Pod) level</li>
<li>Node level</li>
<li>Cluster level</li>
</ul>
<h3 id="应用-Pod-级别"><a href="#应用-Pod-级别" class="headerlink" title="Application (Pod) level"></a>Application (Pod) level</h3><p>Pod-level logs go to standard output and standard error by default, just as with plain Docker containers. View them with kubectl logs pod-name -n namespace.</p>
<h3 id="节点级别"><a href="#节点级别" class="headerlink" title="Node level"></a>Node level</h3><p>Node-level logs are managed by configuring the container log-driver, combined with logrotate: once a log exceeds the size limit, it is rotated automatically.</p>
<p><img src="https://cdm.yp14.cn/img1/v2-720fa0c0e457b92ec73b56d2ab8ab0a3_1440w.jpg" alt=""></p>
<h3 id="集群级别"><a href="#集群级别" class="headerlink" title="Cluster level"></a>Cluster level</h3><p>There are three approaches to cluster-level log collection:</p>
<ul>
<li><code>Node agent</code>: collect logs at the node level, typically with a DaemonSet on every node. This costs few resources, since only the nodes run an agent, and is non-intrusive to applications; the drawback is that it only fits when all in-container application logs go to standard output.</li>
</ul>
<p><img src="https://cdm.yp14.cn/img1/v2-0a107ec683ae03521776c23be62b6cd3_1440w.jpg" alt=""></p>
<ul>
<li>Use a <code>sidecar container</code> as the container's log agent — that is, run a log-processing container alongside the application container in the same pod. It takes two forms:</li>
</ul>
<p>The <code>first</code> form collects the application container's logs and writes them to standard output (a <code>streaming sidecar container</code>). Note that the host then actually holds two copies of the same logs: the one written by the application itself, and the JSON files backing the sidecar's stdout and stderr. This wastes a lot of disk, so avoid it unless there is no other choice — i.e., the application container absolutely cannot be modified.</p>
<p><img src="https://cdm.yp14.cn/img1/v2-927c50e278093e73235a2c2c5beb8cee_1440w.jpg" alt=""></p>
<p>The <code>second</code> form runs a <code>log-collection agent</code> (such as Logstash or Fluentd) in every pod — effectively moving the logging agent of the first approach into the pod. This consumes more resources (CPU, memory), and since the logs are not written to standard output, <code>kubectl logs</code> shows nothing.</p>
<p><img src="https://cdm.yp14.cn/img1/v2-17b0e238392c5389fa7efebbaddd35f1_1440w.jpg" alt=""></p>
<p><code>Pushing logs to the storage backend directly from the application container</code> is the simplest option: the application itself sends its log content to the log-collection backend.</p>
<p><img src="https://cdm.yp14.cn/img1/v2-def4b53da69a5160e134bb8313ab2cfb_1440w.jpg" alt=""></p>
<h2 id="日志架构"><a href="#日志架构" class="headerlink" title="Logging architecture"></a>Logging architecture</h2><p>Given the schemes introduced above, a unified log-collection system can use the node-agent approach to collect the container logs on each node; the overall architecture is shown in the figure.</p>
<p><img src="https://cdm.yp14.cn/img1/v2-ceb52f7badffa061440e58139a97df8f_1440w.jpg" alt=""></p>
<p>In detail:</p>
<ul>
<li>1. All application containers are built on the s6 base image, and container application logs are redirected to a file under a host directory such as /data/logs/namespace/appname/podname/log/xxxx.log</li>
<li>2. log-agent bundles tools such as filebeat and logrotate, with filebeat acting as the log-file collection agent</li>
<li>3. filebeat ships the collected logs to Kafka</li>
<li>4. Kafka forwards the logs to the ES log storage / Kibana search layer</li>
<li>5. Logstash serves as the middle layer, creating indices in ES and consuming the Kafka messages</li>
</ul>
<p>The overall flow is easy to follow, but three problems remain to be solved:</p>
<ul>
<li>1. How to dynamically update the filebeat configuration when users deploy new applications</li>
<li>2. How to guarantee that every log file is rotated properly</li>
<li>3. If more features are needed, filebeat must be extended through custom development to support more custom configuration</li>
</ul>
<h2 id="付诸实践"><a href="#付诸实践" class="headerlink" title="Putting it into practice"></a>Putting it into practice</h2><p>To solve these problems, develop a log-agent application that runs as a DaemonSet on every node of the k8s cluster and bundles filebeat, logrotate, and the feature components to be developed.</p>
<p>For the first problem — dynamically updating the filebeat configuration — the <code>http://github.com/fsnotify/fsnotify</code> package can watch the log directory for create and delete events and re-render the filebeat configuration file from a template.</p>
<p>For the second problem, the <code>http://github.com/robfig/cron</code> package can create a cron job that rotates the log files periodically. Mind which user owns the application log files: if they are not owned by root, switch users in the configuration:</p>
<figure class="highlight"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">/var/log/xxxx/xxxxx.log {</span><br><span class="line"> su www-data www-data</span><br><span class="line"> missingok</span><br><span class="line"> notifempty</span><br><span class="line"> size 1G</span><br><span class="line"> copytruncate</span><br><span class="line"> }</span><br></pre></td></tr></table></figure>
<p>For the third problem — extending filebeat — see <code>https://www.jianshu.com/p/fe3ac68f4</code></p>
<h2 id="总结"><a href="#总结" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>This article only sketches a <code>simple line of thinking</code> for k8s log collection; adapt it to your company's actual needs.</p>
<h2 id="参考文献"><a href="#参考文献" class="headerlink" title="References"></a>References</h2><ul>
<li>1. kubernetes.io/docs/concepts/cluster-administration/logging/</li>
<li>2. <a href="https://support.rackspace.com/how-to/understanding-logrotate-utility/" target="_blank" rel="external">https://support.rackspace.com/how-to/understanding-logrotate-utility/</a></li>
<li>3. <a href="https://github.com/elastic/beats/tree/master/filebeat" target="_blank" rel="external">https://github.com/elastic/beats/tree/master/filebeat</a></li>
<li>4. <a href="http://skarnet.org/software/s6/" target="_blank" rel="external">http://skarnet.org/software/s6/</a></li>
</ul>
<blockquote>
<ul>
<li>Author: 知道又忘了</li>
<li>Original: <a href="https://zhuanlan.zhihu.com/p/70662744" target="_blank" rel="external">https://zhuanlan.zhihu.com/p/70662744</a></li>
</ul>
</blockquote>