当前位置 博文首页 > 请叫我头头哥:SpringBoot进阶教程(七十)SkyWalking

    请叫我头头哥:SpringBoot进阶教程(七十)SkyWalking

    作者:请叫我头头哥 时间:2021-02-13 22:30

    请叫我头头哥流行的APM(Application Performance Management工具有很多,比如Cat、Zipkin、Pinpoint、SkyWalking,优秀的监控工具还有很多,其它比如还有zabbix、prometheus、Arthas、Grafana之类的。这里主要介绍SkyWalking,它是一款优秀的国产APM工具,包括了分布式追踪、性能指标分析、应用和服务依赖分析等。Skywalking是一个分布式系统的应用程序性能监视工具,专为微服务、云原生架构和基于容器(Docker、K8s、Mesos)架构而设计。SkyWalking 是观察性分析平台和应用性能管理系统。提供分布式追踪、服务网格遥测分析、度量聚合和可视化一体化解决方案。

    流行的APM(Application Performance Management工具有很多,比如Cat、Zipkin、Pinpoint、SkyWalking。优秀的监控工具还有很多,其它比如还有zabbix、prometheus、Arthas、Grafana之类的。这里主要介绍SkyWalking,它是一款优秀的国产APM工具,包括了分布式追踪、性能指标分析、应用和服务依赖分析等。

    Skywalking是一个分布式系统的应用程序性能监视工具,专为微服务、云原生架构和基于容器(Docker、K8s、Mesos)架构而设计。SkyWalking 是观察性分析平台和应用性能管理系统。提供分布式追踪、服务网格遥测分析、度量聚合和可视化一体化解决方案。支持Java, .Net Core, PHP, NodeJS, Golang, LUA语言探针,支持Envoy + Istio构建的Service Mesh。

    v介绍

    1.1 SkyWalking 在逻辑上分为四部分:探针、平台后端、存储和用户界面。其架构图如下:

    SpringBoot进阶教程(七十)SkyWalking

    图片来源于网络,侵删。

    探针:基于不同的来源探针可能是不一样的,但作用都是收集数据,将数据格式化为SkyWalking适用的格式。例如在Java中则是做字节码植入,无侵入式的收集,并通过HTTP或者GRPC方式发送数据到平台后端.

    平台后端:是一个支持集群模式运行的后台,用于数据聚合、数据分析以及驱动数据流从探针到用户界面的流程。平台后端还提供了各种可插拔的能力,如不同来源数据(如来自 Zipkin)格式化,不同存储系统以及集群管理。你甚至还可以使用观测分析语言来进行自定义聚合分析。

    存储:是开放式的,可以选择一个既有的存储系统,如ElasticSearch、H2 或 MySQL 集群(Sharding-Sphere 管理),也可以选择自己实现一个存储系统。

    用户界面:SkyWalking的可视化界面,UI非常炫酷且强大,同样它也是可定制以匹配你已存在的后端的。

    1.2 在SkyWalking中也存在服务、服务实例及端点概念,因为SkyWalking就是提供了这些概念的观测能力:

    服务(Service):表示对请求提供相同行为的一系列或一组工作负载。在使用打点代理或 SDK 的时候,你可以定义服务的名字。如果不定义的话,SkyWalking 将会使用你在平台上定义的名字,如 Istio。

    服务实例(Service Instance):上述的一组工作负载中的每一个工作负载称为一个实例。就像 Kubernetes 中的 pods 一样,服务实例未必就是操作系统上的一个进程。但当你在使用打点代理的时候, 一个服务实例实际就是操作系统上的一个真实进程。

    端点(Endpoint):对于特定服务所接收的请求路径,如 HTTP 的 URI 路径和 gRPC 服务的类名 + 方法签名

    1.3 SkyWalking的优势如下:

    1.多种监控手段,语言探针和服务网格(Service Mesh)

    2.模块化,UI、存储、集群管理多种机制可选

    3.支持告警(告警可以推送到钉钉)

    4.优秀的可视化方案

    v环境准备

    2.1 拉取镜像

    docker pull elasticsearch:7.5.1
    docker pull apache/skywalking-oap-server:8.3.0-es7
    docker pull apache/skywalking-ui:8.3.0

    2.2 创建&启动elasticsearch

    docker run -d --name=es7 \
    -p 9200:9200 -p 9300:9300 \
    -e "discovery.type=single-node" elasticsearch:7.5.1

    注意:若创建es持久化目录,则按下面的命令执行。

    mkdir -p /data/elasticsearch
    docker cp es7:/usr/share/elasticsearch/data /data/elasticsearch/
    docker cp es7:/usr/share/elasticsearch/logs /data/elasticsearch/
    docker rm -f es7
    docker run -d --name=es7 \
      --restart=always \
      -p 9200:9200 -p 9300:9300 \
      -e "discovery.type=single-node" \
      -v /data/elasticsearch/data:/usr/share/elasticsearch/data \
      -v /data/elasticsearch/logs:/usr/share/elasticsearch/logs \
    elasticsearch:7.5.1

    2.3 创建&启动OAP

    docker run --name oap --restart always -d \
    --restart=always \
    -e TZ=Asia/Shanghai \
    -p 12800:12800 \
    -p 11800:11800 \
    --link es7:es7 \
    -e SW_STORAGE=elasticsearch7 \
    -e SW_STORAGE_ES_CLUSTER_NODES=es7:9200 \
    apache/skywalking-oap-server:8.3.0-es7

    SW_STORAGE:表示选择elasticsearch7作为存储组件

    SW_STORAGE_ES_CLUSTER_NODES:elasticsearch的节点,多个用逗号隔开

    以上参数均为application.yml文件中的参数。

    application.yml详细信息如下:

    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    cluster:
      selector: ${SW_CLUSTER:standalone}
      standalone:
      # Please check your ZooKeeper is 3.5+, However, it is also compatible with ZooKeeper 3.4.x. Replace the ZooKeeper 3.5+
      # library the oap-libs folder with your ZooKeeper 3.4.x library.
      zookeeper:
        nameSpace: ${SW_NAMESPACE:""}
        hostPort: ${SW_CLUSTER_ZK_HOST_PORT:localhost:2181}
        # Retry Policy
        baseSleepTimeMs: ${SW_CLUSTER_ZK_SLEEP_TIME:1000} # initial amount of time to wait between retries
        maxRetries: ${SW_CLUSTER_ZK_MAX_RETRIES:3} # max number of times to retry
        # Enable ACL
        enableACL: ${SW_ZK_ENABLE_ACL:false} # disable ACL in default
        schema: ${SW_ZK_SCHEMA:digest} # only support digest schema
        expression: ${SW_ZK_EXPRESSION:skywalking:skywalking}
      kubernetes:
        namespace: ${SW_CLUSTER_K8S_NAMESPACE:default}
        labelSelector: ${SW_CLUSTER_K8S_LABEL:app=collector,release=skywalking}
        uidEnvName: ${SW_CLUSTER_K8S_UID:SKYWALKING_COLLECTOR_UID}
      consul:
        serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
        # Consul cluster nodes, example: 10.0.0.1:8500,10.0.0.2:8500,10.0.0.3:8500
        hostPort: ${SW_CLUSTER_CONSUL_HOST_PORT:localhost:8500}
        aclToken: ${SW_CLUSTER_CONSUL_ACLTOKEN:""}
      etcd:
        serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
        # etcd cluster nodes, example: 10.0.0.1:2379,10.0.0.2:2379,10.0.0.3:2379
        hostPort: ${SW_CLUSTER_ETCD_HOST_PORT:localhost:2379}
      nacos:
        serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
        hostPort: ${SW_CLUSTER_NACOS_HOST_PORT:localhost:8848}
        # Nacos Configuration namespace
        namespace: ${SW_CLUSTER_NACOS_NAMESPACE:"public"}
        # Nacos auth username
        username: ${SW_CLUSTER_NACOS_USERNAME:""}
        password: ${SW_CLUSTER_NACOS_PASSWORD:""}
        # Nacos auth accessKey
        accessKey: ${SW_CLUSTER_NACOS_ACCESSKEY:""}
        secretKey: ${SW_CLUSTER_NACOS_SECRETKEY:""}
    core:
      selector: ${SW_CORE:default}
      default:
        # Mixed: Receive agent data, Level 1 aggregate, Level 2 aggregate
        # Receiver: Receive agent data, Level 1 aggregate
        # Aggregator: Level 2 aggregate
        role: ${SW_CORE_ROLE:Mixed} # Mixed/Receiver/Aggregator
        restHost: ${SW_CORE_REST_HOST:0.0.0.0}
        restPort: ${SW_CORE_REST_PORT:12800}
        restContextPath: ${SW_CORE_REST_CONTEXT_PATH:/}
        restMinThreads: ${SW_CORE_REST_JETTY_MIN_THREADS:1}
        restMaxThreads: ${SW_CORE_REST_JETTY_MAX_THREADS:200}
        restIdleTimeOut: ${SW_CORE_REST_JETTY_IDLE_TIMEOUT:30000}
        restAcceptorPriorityDelta: ${SW_CORE_REST_JETTY_DELTA:0}
        restAcceptQueueSize: ${SW_CORE_REST_JETTY_QUEUE_SIZE:0}
        gRPCHost: ${SW_CORE_GRPC_HOST:0.0.0.0}
        gRPCPort: ${SW_CORE_GRPC_PORT:11800}
        maxConcurrentCallsPerConnection: ${SW_CORE_GRPC_MAX_CONCURRENT_CALL:0}
        maxMessageSize: ${SW_CORE_GRPC_MAX_MESSAGE_SIZE:0}
        gRPCThreadPoolQueueSize: ${SW_CORE_GRPC_POOL_QUEUE_SIZE:-1}
        gRPCThreadPoolSize: ${SW_CORE_GRPC_THREAD_POOL_SIZE:-1}
        gRPCSslEnabled: ${SW_CORE_GRPC_SSL_ENABLED:false}
        gRPCSslKeyPath: ${SW_CORE_GRPC_SSL_KEY_PATH:""}
        gRPCSslCertChainPath: ${SW_CORE_GRPC_SSL_CERT_CHAIN_PATH:""}
        gRPCSslTrustedCAPath: ${SW_CORE_GRPC_SSL_TRUSTED_CA_PATH:""}
        downsampling:
          - Hour
          - Day
        # Set a timeout on metrics data. After the timeout has expired, the metrics data will automatically be deleted.
        enableDataKeeperExecutor: ${SW_CORE_ENABLE_DATA_KEEPER_EXECUTOR:true} # Turn it off then automatically metrics data delete will be close.
        dataKeeperExecutePeriod: ${SW_CORE_DATA_KEEPER_EXECUTE_PERIOD:5} # How often the data keeper executor runs periodically, unit is minute
        recordDataTTL: ${SW_CORE_RECORD_DATA_TTL:3} # Unit is day
        metricsDataTTL: ${SW_CORE_METRICS_DATA_TTL:7} # Unit is day
        # Cache metrics data for 1 minute to reduce database queries, and if the OAP cluster changes within that minute,
        # the metrics may not be accurate within that minute.
        enableDatabaseSession: ${SW_CORE_ENABLE_DATABASE_SESSION:true}
        topNReportPeriod: ${SW_CORE_TOPN_REPORT_PERIOD:10} # top_n record worker report cycle, unit is minute
        # Extra model column are the column defined by in the codes, These columns of model are not required logically in aggregation or further query,
        # and it will cause more load for memory, network of OAP and storage.
        # But, being activated, user could see the name in the storage entities, which make users easier to use 3rd party tool, such as Kibana->ES, to query the data by themselves.
        activeExtraModelColumns: ${SW_CORE_ACTIVE_EXTRA_MODEL_COLUMNS:false}
        # The max length of service + instance names should be less than 200
        serviceNameMaxLength: ${SW_SERVICE_NAME_MAX_LENGTH:70}
        instanceNameMaxLength: ${SW_INSTANCE_NAME_MAX_LENGTH:70}
        # The max length of service + endpoint names should be less than 240
        endpointNameMaxLength: ${SW_ENDPOINT_NAME_MAX_LENGTH:150}
        # Define the set of span tag keys, which should be searchable through the GraphQL.
        searchableTracesTags: ${SW_SEARCHABLE_TAG_KEYS:http.method,status_code,db.type,db.instance,mq.queue,mq.topic,mq.broker}
    storage:
      selector: ${SW_STORAGE:h2}
      elasticsearch:
        nameSpace: ${SW_NAMESPACE:""}
        clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
        protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
        user: ${SW_ES_USER:""}
        password: ${SW_ES_PASSWORD:""}
        trustStorePath: ${SW_STORAGE_ES_SSL_JKS_PATH:""}
        trustStorePass: ${SW_STORAGE_ES_SSL_JKS_PASS:""}
        secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""} # Secrets management file in the properties format includes the username, password, which are managed by 3rd party tool.
        dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
        indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:1} # Shard number of new indexes
        indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:1} # Replicas number of new indexes
        # Super data set has been defined in the codes, such as trace segments.The following 3 config would be improve es performance when storage super size data in es.
        superDatasetDayStep: ${SW_SUPERDATASET_STORAGE_DAY_STEP:-1} # Represent the number of days in the super size dataset record index, the default value is the same as dayStep when the value is less than 0
        superDatasetIndexShardsFactor: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_SHARDS_FACTOR:5} #  This factor provides more shards for the super data set, shards number = indexShardsNumber * superDatasetIndexShardsFactor. Also, this factor effects Zipkin and Jaeger traces.
        superDatasetIndexReplicasNumber: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_REPLICAS_NUMBER:0} # Represent the replicas number in the super size dataset record index, the default value is 0.
        bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000} # Execute the async bulk record data every ${SW_STORAGE_ES_BULK_ACTIONS} requests
        syncBulkActions: ${SW_STORAGE_ES_SYNC_BULK_ACTIONS:50000} # Execute the sync bulk metrics data every ${SW_STORAGE_ES_SYNC_BULK_ACTIONS} requests
        flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
        concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
        resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
        metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
        segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
        profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
        advanced: ${SW_STORAGE_ES_ADVANCED:""}
      elasticsearch7:
        nameSpace: ${SW_NAMESPACE:""}
        clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
        protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
        trustStorePath: ${SW_STORAGE_ES_SSL_JKS_PATH:""}
        trustStorePass: ${SW_STORAGE_ES_SSL_JKS_PASS:""}
        dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
        indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:1} # Shard number of new indexes
        indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:1} # Replicas number of new indexes
        # Super data set has been defined in the codes, such as trace segments.The following 3 config would be improve es performance when storage super size data in es.
        superDatasetDayStep: ${SW_SUPERDATASET_STORAGE_DAY_STEP:-1} # Represent the number of days in the super size dataset record index, the default value is the same as dayStep when the value is less than 0
        superDatasetIndexShardsFactor: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_SHARDS_FACTOR:5} #  This factor provides more shards for the super data set, shards number = indexShardsNumber * superDatasetIndexShardsFactor. Also, this factor effects Zipkin and Jaeger traces.
        superDatasetIndexReplicasNumber: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_REPLICAS_NUMBER:0} # Represent the replicas number in the super size dataset record index, the default value is 0.
        user: ${SW_ES_USER:""}
        password: ${SW_ES_PASSWORD:""}
        secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""} # Secrets management file in the properties format includes the username, password, which are managed by 3rd party tool.
        bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:1000} # Execute the async bulk record data every ${SW_STORAGE_ES_BULK_ACTIONS} requests
        syncBulkActions: ${SW_STORAGE_ES_SYNC_BULK_ACTIONS:50000} # Execute the sync bulk metrics data every ${SW_STORAGE_ES_SYNC_BULK_ACTIONS} requests
        flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
        concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
        resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
        metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
        segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
        profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
        advanced: ${SW_STORAGE_ES_ADVANCED:""}
      h2:
        driver: ${SW_STORAGE_H2_DRIVER:org.h2.jdbcx.JdbcDataSource}
        url: ${SW_STORAGE_H2_URL:jdbc:h2:mem:skywalking-oap-db}
        user: ${SW_STORAGE_H2_USER:sa}
        metadataQueryMaxSize: ${SW_STORAGE_H2_QUERY_MAX_SIZE:5000}
        maxSizeOfArrayColumn: ${SW_STORAGE_MAX_SIZE_OF_ARRAY_COLUMN:20}
        numOfSearchableValuesPerTag: ${SW_STORAGE_NUM_OF_SEARCHABLE_VALUES_PER_TAG:2}
      mysql:
        properties:
          jdbcUrl: ${SW_JDBC_URL:"jdbc:mysql://localhost:3306/swtest"}
          dataSource.user: ${SW_DATA_SOURCE_USER:root}
          dataSource.password: ${SW_DATA_SOURCE_PASSWORD:root@1234}
          dataSource.cachePrepStmts: ${SW_DATA_SOURCE_CACHE_PREP_STMTS:true}
          dataSource.prepStmtCacheSize: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_SIZE:250}
          dataSource.prepStmtCacheSqlLimit: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_LIMIT:2048}
          dataSource.useServerPrepStmts: ${SW_DATA_SOURCE_USE_SERVER_PREP_STMTS:true}
        metadataQueryMaxSize: ${SW_STORAGE_MYSQL_QUERY_MAX_SIZE:5000}
        maxSizeOfArrayColumn: ${SW_STORAGE_MAX_SIZE_OF_ARRAY_COLUMN:20}
        numOfSearchableValuesPerTag: ${SW_STORAGE_NUM_OF_SEARCHABLE_VALUES_PER_TAG:2}
      tidb:
        properties:
          jdbcUrl: ${SW_JDBC_URL:"jdbc:mysql://localhost:4000/tidbswtest"}
          dataSource.user: ${SW_DATA_SOURCE_USER:root}
          dataSource.password: ${SW_DATA_SOURCE_PASSWORD:""}
          dataSource.cachePrepStmts: ${SW_DATA_SOURCE_CACHE_PREP_STMTS:true}
          dataSource.prepStmtCacheSize: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_SIZE:250}
          dataSource.prepStmtCacheSqlLimit: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_LIMIT:2048}
          dataSource.useServerPrepStmts: ${SW_DATA_SOURCE_USE_SERVER_PREP_STMTS:true}
          dataSource.useAffectedRows: ${SW_DATA_SOURCE_USE_AFFECTED_ROWS:true}
        metadataQueryMaxSize: ${SW_STORAGE_MYSQL_QUERY_MAX_SIZE:5000}
        maxSizeOfArrayColumn: ${SW_STORAGE_MAX_SIZE_OF_ARRAY_COLUMN:20}
        numOfSearchableValuesPerTag: ${SW_STORAGE_NUM_OF_SEARCHABLE_VALUES_PER_TAG:2}
      influxdb:
        # InfluxDB configuration
        url: ${SW_STORAGE_INFLUXDB_URL:http://localhost:8086}
        user: ${SW_STORAGE_INFLUXDB_USER:root}
        password: ${SW_STORAGE_INFLUXDB_PASSWORD:}
        database: ${SW_STORAGE_INFLUXDB_DATABASE:skywalking}
        actions: ${SW_STORAGE_INFLUXDB_ACTIONS:1000} # the number of actions to collect
        duration: ${SW_STORAGE_INFLUXDB_DURATION:1000} # the time to wait at most (milliseconds)
        batchEnabled: ${SW_STORAGE_INFLUXDB_BATCH_ENABLED:true}
        fetchTaskLogMaxSize: ${SW_STORAGE_INFLUXDB_FETCH_TASK_LOG_MAX_SIZE:5000} # the max number of fetch task log in a request
    
    agent-analyzer:
      selector: ${SW_AGENT_ANALYZER:default}
      default:
        sampleRate: ${SW_TRACE_SAMPLE_RATE:10000} # The sample rate precision is 1/10000. 10000 means 100% sample in default.
        slowDBAccessThreshold: ${SW_SLOW_DB_THRESHOLD:default:200,mongodb:100} # The slow database access thresholds. Unit ms.
        forceSampleErrorSegment: ${SW_FORCE_SAMPLE_ERROR_SEGMENT:true} # When sampling mechanism active, this config can open(true) force save some error segment. true is default.
        segmentStatusAnalysisStrategy: ${SW_SEGMENT_STATUS_ANALYSIS_STRATEGY:FROM_SPAN_STATUS} # Determine the final segment status from the status of spans. Available values are `FROM_SPAN_STATUS` , `FROM_ENTRY_SPAN` and `FROM_FIRST_SPAN`. `FROM_SPAN_STATUS` represents the segment status would be error if any span is in error status. `FROM_ENTRY_SPAN` means the segment status would be determined by the status of entry spans only. `FROM_FIRST_SPAN` means the segment status would be determined by the status of the first span only.
        # Nginx and Envoy agents can't get the real remote address.
        # Exit spans with the component in the list would not generate the client-side instance relation metrics.
        noUpstreamRealAddressAgents: ${SW_NO_UPSTREAM_REAL_ADDRESS:6000,9000}
        slowTraceSegmentThreshold: ${SW_SLOW_TRACE_SEGMENT_THRESHOLD:-1} # Setting this threshold about the latency would make the slow trace segments sampled if they cost more time, even the sampling mechanism activated. The default value is `-1`, which means would not sample slow traces. Unit, millisecond.
        meterAnalyzerActiveFiles: ${SW_METER_ANALYZER_ACTIVE_FILES:} # Which files could be meter analyzed, files split by ","
    
    receiver-sharing-server:
      selector: ${SW_RECEIVER_SHARING_SERVER:default}
      default:
        # For Jetty server
        restHost: ${SW_RECEIVER_SHARING_REST_HOST:0.0.0.0}
        restPort: ${SW_RECEIVER_SHARING_REST_PORT:0}
        contextPath: ${SW_RECEIVER_SHARING_REST_CONTEXT_PATH:/}
        restMinThreads: ${SW_RECEIVER_SHARING_JETTY_MIN_THREADS:1}
        restMaxThreads: ${SW_RECEIVER_SHARING_JETTY_MAX_THREADS:200}
        restIdleTimeOut: ${SW_RECEIVER_SHARING_JETTY_IDLE_TIMEOUT:30000}
        restAcceptorPriorityDelta: ${SW_RECEIVER_SHARING_JETTY_DELTA:0}
        restAcceptQueueSize: ${SW_RECEIVER_SHARING_JETTY_QUEUE_SIZE:0}
        # For gRPC server
        gRPCHost: ${SW_RECEIVER_GRPC_HOST:0.0.0.0}
        gRPCPort: ${SW_RECEIVER_GRPC_PORT:0}
        maxConcurrentCallsPerConnection: ${SW_RECEIVER_GRPC_MAX_CONCURRENT_CALL:0}
        maxMessageSize: ${SW_RECEIVER_GRPC_MAX_MESSAGE_SIZE:0}
        gRPCThreadPoolQueueSize: ${SW_RECEIVER_GRPC_POOL_QUEUE_SIZE:0}
        gRPCThreadPoolSize: ${SW_RECEIVER_GRPC_THREAD_POOL_SIZE:0}
        gRPCSslEnabled: ${SW_RECEIVER_GRPC_SSL_ENABLED:false}
        gRPCSslKeyPath: ${SW_RECEIVER_GRPC_SSL_KEY_PATH:""}
        gRPCSslCertChainPath: ${SW_RECEIVER_GRPC_SSL_CERT_CHAIN_PATH:""}
        authentication: ${SW_AUTHENTICATION:""}
    receiver-register:
      selector: ${SW_RECEIVER_REGISTER:default}
      default:
    
    receiver-trace:
      selector: ${SW_RECEIVER_TRACE:default}
      default:
    
    receiver-jvm:
      selector: ${SW_RECEIVER_JVM:default}
      default:
    
    receiver-clr:
      selector: ${SW_RECEIVER_CLR:default}
      default:
    
    receiver-profile:
      selector: ${SW_RECEIVER_PROFILE:default}
      default:
    
    service-mesh:
      selector: ${SW_SERVICE_MESH:default}
      default:
    
    envoy-metric:
      selector: ${SW_ENVOY_METRIC:default}
      default:
        acceptMetricsService: ${SW_ENVOY_METRIC_SERVICE:true}
        alsHTTPAnalysis: ${SW_ENVOY_METRIC_ALS_HTTP_ANALYSIS:""}
        # `k8sServiceNameRule` allows you to customize the service name in ALS via Kubernetes metadata,
        # the available variables are `pod`, `service`, f.e., you can use `${service.metadata.name}-${pod.metadata.labels.version}`
        # to append the version number to the service name.
        # Be careful, when using environment variables to pass this configuration, use single quotes(`''`) to avoid it being evaluated by the shell.
        k8sServiceNameRule: ${K8S_SERVICE_NAME_RULE:"${service.metadata.name}"}
    
    prometheus-fetcher:
      selector: ${SW_PROMETHEUS_FETCHER:-}
      default:
        enabledRules: ${SW_PROMETHEUS_FETCHER_ENABLED_RULES:"self"}
    
    kafka-fetcher:
      selector: ${SW_KAFKA_FETCHER:-}
      default:
        bootstrapServers: ${SW_KAFKA_FETCHER_SERVERS:localhost:9092}
        partitions: ${SW_KAFKA_FETCHER_PARTITIONS:3}
        replicationFactor: ${SW_KAFKA_FETCHER_PARTITIONS_FACTOR:2}
        enableMeterSystem: ${SW_KAFKA_FETCHER_ENABLE_METER_SYSTEM:false}
        isSharding: ${SW_KAFKA_FETCHER_IS_SHARDING:false}
        consumePartitions: ${SW_KAFKA_FETCHER_CONSUME_PARTITIONS:""}
        kafkaHandlerThreadPoolSize: ${SW_KAFKA_HANDLER_THREAD_POOL_SIZE:-1}
        kafkaHandlerThreadPoolQueueSize: ${SW_KAFKA_HANDLER_THREAD_POOL_QUEUE_SIZE:-1}
    
    receiver-meter:
      selector: ${SW_RECEIVER_METER:default}
      default:
    
    receiver-otel:
      selector: ${SW_OTEL_RECEIVER:-}
      default:
        enabledHandlers: ${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"oc"}
        enabledOcRules: ${SW_OTEL_RECEIVER_ENABLED_OC_RULES:"istio-controlplane"}
    
    receiver_zipkin:
      selector: ${SW_RECEIVER_ZIPKIN:-}
      default:
        host: ${SW_RECEIVER_ZIPKIN_HOST:0.0.0.0}
        port: ${SW_RECEIVER_ZIPKIN_PORT:9411}
        contextPath: ${SW_RECEIVER_ZIPKIN_CONTEXT_PATH:/}
        jettyMinThreads: ${SW_RECEIVER_ZIPKIN_JETTY_MIN_THREADS:1}
        jettyMaxThreads: ${SW_RECEIVER_ZIPKIN_JETTY_MAX_THREADS:200}
        jettyIdleTimeOut: ${SW_RECEIVER_ZIPKIN_JETTY_IDLE_TIMEOUT:30000}
        jettyAcceptorPriorityDelta: ${SW_RECEIVER_ZIPKIN_JETTY_DELTA:0}
        jettyAcceptQueueSize: ${SW_RECEIVER_ZIPKIN_QUEUE_SIZE:0}
    
    receiver_jaeger:
      selector: ${SW_RECEIVER_JAEGER:-}
      default:
        gRPCHost: ${SW_RECEIVER_JAEGER_HOST:0.0.0.0}
        gRPCPort: ${SW_RECEIVER_JAEGER_PORT:14250}
    
    receiver-browser:
      selector: ${SW_RECEIVER_BROWSER:default}
      default:
        # The sample rate precision is 1/10000. 10000 means 100% sample in default.
        sampleRate: ${SW_RECEIVER_BROWSER_SAMPLE_RATE:10000}
    
    query:
      selector: ${SW_QUERY:graphql}
      graphql:
        path: ${SW_QUERY_GRAPHQL_PATH:/graphql}
    
    alarm:
      selector: ${SW_ALARM:default}
      default:
    
    telemetry:
      selector: ${SW_TELEMETRY:none}
      none:
      prometheus:
        host: ${SW_TELEMETRY_PROMETHEUS_HOST:0.0.0.0}
        port: ${SW_TELEMETRY_PROMETHEUS_PORT:1234}
        sslEnabled: ${SW_TELEMETRY_PROMETHEUS_SSL_ENABLED:false}
        sslKeyPath: ${SW_TELEMETRY_PROMETHEUS_SSL_KEY_PATH:""}
        sslCertChainPath: ${SW_TELEMETRY_PROMETHEUS_SSL_CERT_CHAIN_PATH:""}
    
    configuration:
      selector: ${SW_CONFIGURATION:none}
      none:
      grpc:
        host: ${SW_DCS_SERVER_HOST:""}
        port: ${SW_DCS_SERVER_PORT:80}
        clusterName: ${SW_DCS_CLUSTER_NAME:SkyWalking}
        period: ${SW_DCS_PERIOD:20}
      apollo:
        apolloMeta: ${SW_CONFIG_APOLLO:http://localhost:8080}
        apolloCluster: ${SW_CONFIG_APOLLO_CLUSTER:default}
        apolloEnv: ${SW_CONFIG_APOLLO_ENV:""}
        appId: ${SW_CONFIG_APOLLO_APP_ID:skywalking}
        period: ${SW_CONFIG_APOLLO_PERIOD:5}
      zookeeper:
        period: ${SW_CONFIG_ZK_PERIOD:60} # Unit seconds, sync period. Default fetch every 60 seconds.
        nameSpace: ${SW_CONFIG_ZK_NAMESPACE:/default}
        hostPort: ${SW_CONFIG_ZK_HOST_PORT:localhost:2181}
        # Retry Policy
        baseSleepTimeMs: ${SW_CONFIG_ZK_BASE_SLEEP_TIME_MS:1000} # initial amount of time to wait between retries
        maxRetries: ${SW_CONFIG_ZK_MAX_RETRIES:3} # max number of times to retry
      etcd:
        period: ${SW_CONFIG_ETCD_PERIOD:60} # Unit seconds, sync period. Default fetch every 60 seconds.
        group: ${SW_CONFIG_ETCD_GROUP:skywalking}
        serverAddr: ${SW_CONFIG_ETCD_SERVER_ADDR:localhost:2379}
        clusterName: ${SW_CONFIG_ETCD_CLUSTER_NAME:default}
      consul:
        # Consul host and ports, separated by comma, e.g. 1.2.3.4:8500,2.3.4.5:8500
        hostAndPorts: ${SW_CONFIG_CONSUL_HOST_AND_PORTS:1.2.3.4:8500}
        # Sync period in seconds. Defaults to 60 seconds.
        period: ${SW_CONFIG_CONSUL_PERIOD:60}
        # Consul aclToken
        aclToken: ${SW_CONFIG_CONSUL_ACL_TOKEN:""}
      k8s-configmap:
        period: ${SW_CONFIG_CONFIGMAP_PERIOD:60}
        namespace: ${SW_CLUSTER_K8S_NAMESPACE:default}
        labelSelector: ${SW_CLUSTER_K8S_LABEL:app=collector,release=skywalking}
      nacos:
        # Nacos Server Host
        serverAddr: ${SW_CONFIG_NACOS_SERVER_ADDR:127.0.0.1}
        # Nacos Server Port
        port: ${SW_CONFIG_NACOS_SERVER_PORT:8848}
        # Nacos Configuration Group
        group: ${SW_CONFIG_NACOS_SERVER_GROUP:skywalking}
        # Nacos Configuration namespace
        namespace: ${SW_CONFIG_NACOS_SERVER_NAMESPACE:}
        # Unit seconds, sync period. Default fetch every 60 seconds.
        period: ${SW_CONFIG_NACOS_PERIOD:60}
        # Nacos auth username
        username: ${SW_CONFIG_NACOS_USERNAME:""}
        password: ${SW_CONFIG_NACOS_PASSWORD:""}
        # Nacos auth accessKey
        accessKey: ${SW_CONFIG_NACOS_ACCESSKEY:""}
        secretKey: ${SW_CONFIG_NACOS_SECRETKEY:""}
    
    exporter:
      selector: ${SW_EXPORTER:-}
      grpc:
        targetHost: ${SW_EXPORTER_GRPC_HOST:127.0.0.1}
        targetPort: ${SW_EXPORTER_GRPC_PORT:9870}
    
    health-checker:
      selector: ${SW_HEALTH_CHECKER:-}
      default:
        checkIntervalSeconds: ${SW_HEALTH_CHECKER_INTERVAL_SECONDS:5}
    View Code

    2.4 创建&启动UI

    docker run -d --name skywalking-ui \
    --restart=always \
    -e TZ=Asia/Shanghai \
    -p 8101:8080 \
    --link oap:oap \
    -e SW_OAP_ADDRESS=oap:12800 \
    apache/skywalking-ui:8.3.0

    2.5 下载源码包

    wget https://mirrors.tuna.tsinghua.edu.cn/apache/skywalking/8.3.0/apache-skywalking-apm-8.3.0.tar.gz

    官网的源码包下载比较慢,就换成tuna的了。下载好了之后,解压在/opt目录下,暂时不用管,这个在后面会用到agent。

    v部署/接入SkyWalking

    3.1 生成springboot JAR包

    若生成springbootJAR包有疑惑的,可以看看这篇文章。SpringBoot入门教程(二)CentOS部署SpringBoot项目从0到1

    3.2 启动JAR包

    nohup java -javaagent:/opt/apache-skywalking-apm-bin/agent/skywalking-agent.jar -Dskywalking.agent.service_name=toutou_blog -Dskywalking.collector.backend_service=127.0.0.1:11800 -jar /data/package/learn-web-0.0.1-SNAPSHOT.jar -d --server.port=8100 &

    -javaagent:用于指定探针路径。指定agent包位置。在上面的步骤中已经将apache-skywalking-apm-8.3.0.tar.gz解压到/opt目录了,因此路径为:/opt/apache-skywalking-apm-bin/agent/skywalking-agent.jar

    -Dskywalking.agent.service_name:用于重写 agent/config/agent.config 配置文件中的服务名

    -Dskywalking.collector.backend_service:用于重写agent/config/agent.config 配置文件中的服务地址

    3.3 访问UI

    第一次访问时,需要先调用springboot中的接口后,SkyWalking UI中即会load对应的信息,效果如下图。

    SpringBoot进阶教程(七十)SkyWalking

    拓扑图/topology:

    SpringBoot进阶教程(七十)SkyWalking

    追踪/trace:

    SpringBoot进阶教程(七十)SkyWalking