Telegraf and Grafana monitor SQL Server on multiple platforms

Keywords: SQL Sever InfluxDB Database SQL RPM

problem

SQL Server is deployed on multiple platforms (Windows,Linux and Container) in many enterprises. A solution that can support multiple platforms is needed to collect and display related monitoring indicators.

I choose the popular monitoring display tool Grafana and monitoring indicator collection tool Telegraf to implement. This is also to facilitate the integration and docking with the existing monitoring platform in the enterprise.

As shown in the above figure, Telegraf is deployed in the host where SQL is located, collects data and sends it to the temporal database, Influxdb for storage, and then Grafana is used to display the data.

Solution

  • Install and configure InfluxDB

I install InfluxDB and Grafana on the same CentOS host, preferably separately in the production environment.

# Download the stable version of 1.8 and install it
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.8.0.x86_64.rpm
chmod 755 influxdb-1.8.0.x86_64.rpm
yum localinstall influxdb-1.8.0.x86_64.rpm

# Start and set self start
systemctl start influxdb
systemctl enable influxdb

# 8086 is used for HTTP connection of client and 8088 is used for CLI to call RPC for backup and restore operation
firewall-cmd --zone=public --add-port=8086/tcp --permanent
firewall-cmd --zone=public --add-port=8088/tcp --permanent
firewall-cmd --reload

# Connect to influxdb and create users
fluxdb
> CREATE USER admin WITH PASSWORD '<password>' WITH ALL PRIVILEGES

# Enable http user authentication, modify influxdb.conf Auth enabled = true in http section
vim /etc/influxdb/influxdb.conf
systemctl restart influxdb

# Create a database to store monitoring data and keep the data for 6 months
influx -username 'admin' -password '<password>'
> CREATE DATABASE telegraf
> CREATE RETENTION POLICY telegraf_6m ON telegraf DURATION 180d REPLICATION 1 DEFAULT
> SHOW DATABASES
  • Installing and configuring Grafana

# Download and install Grafana
wget https://dl.grafana.com/oss/release/grafana-7.0.1-1.x86_64.rpm
chmod 775 grafana-7.0.1-1.x86_64.rpm
yum localinstall grafana-7.0.1-1.x86_64.rpm

# Set self start
systemctl start grafana-server.service
systemctl enable grafana-server.service

# Allow Grafana's default port 3000
firewall-cmd --zone=public --add-port=3000/tcp --permanent
firewall-cmd --reload

Then visit http://:3000 in the Browser. When you first visit, the default login account and password are admin. You will be prompted to change the password after login.

  • Installing and configuring Telegraf on client hosts

The so-called client is the host of SQL

To connect to SQL, Telegraf requires a login with VIEW SERVER STATE and VIEW ANY DEFINITION permission, so it needs to be created on each monitored instance.

USE master;
GO
CREATE LOGIN [telegraf] WITH PASSWORD = N'1qaz@WSX';
GO
GRANT VIEW SERVER STATE TO [telegraf];
GO
GRANT VIEW ANY DEFINITION TO [telegraf];
GO
  • Telegraf on Linux
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.14.3-1.x86_64.rpm
sudo yum localinstall telegraf-1.14.3-1.x86_64.rpm 

After installation, modify the configuration file of Telegraf before starting. In the configuration file, there are two parts: input and output. Input indicates where the monitoring data comes from, and output indicates where the monitoring data is to be sent.

Open / etc/telegraf/telegraf.conf , found[[ outputs.influxdb ]]Section, all configuration items are annotated by default. We need to remove comments and configure some items. It is mainly the address, user name, password and database name of the Influxdb.

[[outputs.influxdb]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  # urls = ["unix:///var/run/influxdb.sock"]
  # urls = ["udp://127.0.0.1:8089"]
  urls = ["http://172.17.2.4:8086"]

  ## The target database for metrics; will be created as needed.
  ## For UDP url endpoint database needs to be configured on server side.
  database = "telegraf"

  ## The value of this tag will be used to determine the database.  If this
  ## tag is not set the 'database' option is used as the default.
  # database_tag = ""

  ## If true, the 'database_tag' will not be included in the written metric.
  # exclude_database_tag = false

  ## If true, no CREATE DATABASE queries will be sent.  Set to true when using
  ## Telegraf with a user without permissions to create databases or when the
  ## database already exists.
  skip_database_creation = true

  ## Name of existing retention policy to write to.  Empty string writes to
  ## the default retention policy.  Only takes effect when using HTTP.
  retention_policy = ""

  ## The value of this tag will be used to determine the retention policy.  If this
  ## tag is not set the 'retention_policy' option is used as the default.
  # retention_policy_tag = ""

  ## If true, the 'retention_policy_tag' will not be included in the written metric.
  # exclude_retention_policy_tag = false

  ## Write consistency (clusters only), can be: "any", "one", "quorum", "all".
  ## Only takes effect when using HTTP.
  write_consistency = "any"

  ## Timeout for HTTP messages.
  timeout = "5s"

  ## HTTP Basic Auth
  username = "admin"
  password = "<password>"
  • Find[[ inputs.sqlserver ]]Section, uncomment the related configuration items, and the servers section connects to the local instance.

The default Plugin of Telegraf includes the implementation of SQL Server, which also includes the implementation of Azure SQL PaaS

# # Read metrics from Microsoft SQL Server
 [[inputs.sqlserver]]
#   ## Specify instances to monitor with a list of connection strings.
#   ## All connection parameters are optional.
#   ## By default, the host is localhost, listening on default port, TCP 1433.
#   ##   for Windows, the user is the currently running AD user (SSO).
#   ##   See https://github.com/denisenkom/go-mssqldb for detailed connection
#   ##   parameters, in particular, tls connections can be created like so:
#   ##   "encrypt=true;certificate=<cert>;hostNameInCertificate=<SqlServer host fqdn>"
        servers = [
                        "Server=localhost;Port=1433;User Id=telegraf;Password=<yourPassword>;app name=telegraf;log=1;"
                ]
#
#   ## Optional parameter, setting this to 2 will use a new version
#   ## of the collection queries that break compatibility with the original
#   ## dashboards.
        query_version = 2
#
#   ## If you are using AzureDB, setting this to true will gather resource utilization metrics
#   # azuredb = false
#
#   ## Possible queries:
#   ## - PerformanceCounters
#   ## - WaitStatsCategorized
#   ## - DatabaseIO
#   ## - DatabaseProperties
#   ## - CPUHistory
#   ## - DatabaseSize
#   ## - DatabaseStats
#   ## - MemoryClerk
#   ## - VolumeSpace
#   ## - PerformanceMetrics
#   ## - Schedulers
#   ## - AzureDBResourceStats
#   ## - AzureDBResourceGovernance
#   ## - SqlRequests
#   ## - ServerProperties
#   ## A list of queries to include. If not specified, all the above listed queries are used.
#   # include_query = []
#
#   ## A list of queries to explicitly ignore.
#   exclude_query = [ 'Schedulers' , 'SqlRequests']

After launching Telegraf, you can see the input and collection interval loaded at

[root@SQL19N1 log]# systemctl status telegraf      
โ— telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
   Loaded: loaded (/usr/lib/systemd/system/telegraf.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-05-26 14:19:07 UTC; 19min ago
     Docs: https://github.com/influxdata/telegraf
 Main PID: 12359 (telegraf)
   CGroup: /system.slice/telegraf.service
           โ””โ”€12359 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d

May 26 14:19:07 SQL19N1 systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! Starting Telegraf 1.14.3
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! Loaded inputs: system cpu disk diskio kernel mem processes swap sqlserver
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! Loaded aggregators:
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! Loaded processors:
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! Loaded outputs: influxdb
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! Tags enabled: host=SQL19N1
May 26 14:19:07 SQL19N1 telegraf[12359]: 2020-05-26T14:19:07Z I! [agent] Config: Interval:20s, Quiet:false, Hostname:"SQL19N1", Flush Interval:10s
  • Telegraf on Windows

Execute the following PowerShell command as Administrator

# Download a software
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.14.3_windows_amd64.zip ยท
    -OutFile "c:\temp\telegraf-1.14.3_windows_amd64.zip"

# Extract to C:\Program Files\Telegraf
Expand-Archive "c:\temp\telegraf-1.14.3_windows_amd64.zip", "C:\Program Files"

# Installing telegraf as a windows Service
C:\"Program Files"\Telegraf\telegraf.exe --service install

modify telegraf.conf in outputs.influxdb And add inputs.sqlserver Part 2, these contents are the same as the configuration on Linux, so we won't go into details.

After the modification of conf, you can test whether the telegraf can start normally. If there is no problem, start the telegraf service.

# test
C:\"Program Files"\Telegraf\telegraf.exe --config C:\"Program Files"\Telegraf\telegraf.conf --test

# Start service
C:\"Program Files"\Telegraf\telegraf.exe --service start
  • Configure the data source and Dashboard of Grafana

After logging in to Grafana, configure the InfluxDB data source in the configuration - > data source on the left, fill in the address, account, password and set it as the default data source, as shown in the following figure

Dashboard s can be created by themselves or in the open community (thanks to the enthusiastic and selfless tycoons). Here, I use SQL Servers by Jonathan Rioux . The Piechart used in this Dashboard is not preset by Grafana, so you need to install:

# The Host of Grafana is installed, and the service restart takes effect
grafana-cli plugins install grafana-piechart-panel
systemctl restart grafana-server.service

Then in the Grafana interface, select dashboard - > Import - > fill in dashboard ID - > Import on the left, as shown below:

After configuration, you can see that the information provided by this Dashboard is rich, and you can also modify and add relevant content according to your own needs

summary

  • In fact, the data collection and report can not fully meet the business requirements. The customized data collection and Dashboard are also very easy to implement. Write again next time

  • If Zabbix is already in use, Grafana can directly output data to Zabbix.

  • Telegraf can support Cloud environment very well. Next time, let's talk about the monitoring of Azure SQL PaaS

  • The content of this article only represents personal views and is not related to any company or organization

Posted by hermand on Mon, 01 Jun 2020 03:48:16 -0700