Apache Ignite 2.7.5 Monitoring metrics

邮差的信 submitted on 2020-12-15 01:52:12

Question


This is the Ignite (version 2.7.5) configuration that I am using for my 2-node PARTITIONED cluster.

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">
    <!-- Enable annotation-driven caching. -->


    <bean name="noOpFailureHandler" class="org.apache.ignite.failure.NoOpFailureHandler"/>
    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="igniteInstanceName" value="GridA"/>
        <property name="clientMode" value="false"/>
        <property name="failureDetectionTimeout" value="80000"/>
        <property name="clientFailureDetectionTimeout" value="120000"/>
        <property name="systemWorkerBlockedTimeout" value="30000" />
        <property name="longQueryWarningTimeout" value="3000"/>
        <property name="failureHandler" ref="noOpFailureHandler"/>
        <property name="metricsLogFrequency" value="#{600 * 10 * 1000}"/>
        <property name="rebalanceThreadPoolSize" value="16"/>
        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <!-- Redefining the default region's settings -->
                <property name="pageSize" value="#{4 * 1024}"/>
                <!--<property name="writeThrottlingEnabled" value="true"/>-->
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="persistenceEnabled" value="true"/>
                        <property name="initialSize" value="#{105L * 1024 * 1024 * 1024}"/>
                        <property name="name" value="Default_Region"/>
                        <!-- Setting the maximum size of the default region to 120 GB. -->
                        <property name="maxSize" value="#{120L * 1024 * 1024 * 1024}"/>
                        <property name="checkpointPageBufferSize"
                                  value="#{4096L * 1024 * 1024}"/>
                        <!--<property name="pageEvictionMode" value="RANDOM_2_LRU"/>-->
                    </bean>
                </property>
                <property name="walPath" value="/wal/grid"/>
                <property name="walArchivePath" value="/wal/grid/archive"/>
                <property name="storagePath" value="/ignite/persistence"/>
                <property name="checkpointFrequency" value="180000"/>
                <property name="checkpointThreads" value="8"/>
                <property name="walMode" value="BACKGROUND"/>
                <property name="walSegmentSize" value="#{1L * 1024 * 1024 * 1024}"/>
                <!--<property name="authenticationEnabled" value="true"/>-->
            </bean>
        </property>

        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                        <property name="multicastGroup" value="224.0.0.180"/>
                        <property name="multicastPort" value="47514"/>
                    </bean>
                </property>

            </bean>
        </property>
        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="messageQueueLimit" value="2048"/>
                <property name="socketWriteTimeout" value="10000"/>
                <property name="connectionsPerNode" value="10"/>
                <property name="usePairedConnections" value="true"/>
                <property name="socketReceiveBuffer" value="#{64L * 1024}"/>
            </bean>
        </property>
    </bean>


</beans>

Ignite is started with the following JVM parameters:

/usr/java/jdk1.8.0_144/bin/java -XX:+AggressiveOpts -server -Xms20g -Xmx20g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/etappdata/ignite/logs/PROD/etail-prod-ignite76-164/logs -XX:+ExitOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Xloggc:/etappdata/ignite/logs/PROD/etail-prod-ignite76-164/gc.log -XX:+PrintAdaptiveSizePolicy -XX:+UseTLAB -verbose:gc -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Addresses=true -Djava.net.preferIPv6Stack=false -Djava.net.preferIPv6Addresses=false -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8996 -Dcom.sun.management.jmxremote.rmi.port=8996 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.local.only=false -Djava.rmi.server.hostname=etail-prod-ignite76-164 -XX:MaxDirectMemorySize=4g -javaagent:/tmp/apminsight-javaagent-prod/apminsight-javaagent.jar -Dfile.encoding=UTF-8 -XX:+UseG1GC -DIGNITE_QUIET=false -DIGNITE_SUCCESS_FILE=/ignite/apache-ignite-2.7.5-bin/work/ignite_success_0cbecd49-5b7f-4a41-b2f2-42bb66b2ea5c -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=49128 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -DIGNITE_HOME=/ignite/apache-ignite-2.7.5-bin -DIGNITE_PROG_NAME=./bin/ignite.sh -cp /ignite/apache-ignite-2.7.5-bin/libs/:/ignite/apache-ignite-2.7.5-bin/libs/ignite-indexing/:/ignite/apache-ignite-2.7.5-bin/libs/ignite-spring/:/ignite/apache-ignite-2.7.5-bin/libs/licenses/ org.apache.ignite.startup.cmdline.CommandLineStartup config/my-cache.xml

[Note: Each node has 210 GB RAM]

With this configuration, the node logs metrics like the following every 100 minutes (metricsLogFrequency is 600 * 10 * 1000 ms):

[00:33:36,452][INFO][grid-timeout-worker-#67%GridA%][IgniteKernal%GridA] 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=92dda713, name=GridA, uptime=01:40:00.019]
    ^-- H/N/C [hosts=10, nodes=10, CPUs=172]
    ^-- CPU [cur=2.13%, avg=2.16%, GC=0%]
    ^-- PageMemory [pages=5535967]
    ^-- Heap [used=6605MB, free=67.75%, comm=20480MB]
    ^-- Off-heap [used=21878MB, free=82.24%, comm=123179MB]
    ^--   sysMemPlc region [used=0MB, free=99.99%, comm=99MB]
    ^--   metastoreMemPlc region [used=0MB, free=99.77%, comm=99MB]
    ^--   Default_Region region [used=21878MB, free=82.2%, comm=122880MB]
    ^--   TxLog region [used=0MB, free=100%, comm=99MB]
    ^-- Ignite persistence [used=281575MB]
    ^--   sysMemPlc region [used=0MB]
    ^--   metastoreMemPlc region [used=unknown]
    ^--   Default_Region region [used=281575MB]
    ^--   TxLog region [used=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=6, qSize=0]
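
As a sanity check, the numbers in the log above can be related back to the configuration. The following is a minimal sketch in plain Java (no Ignite dependency); the class and method names are hypothetical and only illustrate the arithmetic. It assumes the "MB" figures in the log are mebibytes, which is consistent with comm=122880MB for a 120 GB region.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: checks values from the posted config against the posted metrics log. */
class ConfigArithmetic {
    static final long MIB = 1024L * 1024;
    static final long GIB = 1024L * MIB;

    /** Converts a millisecond interval (the unit of metricsLogFrequency) to whole minutes. */
    static long millisToMinutes(long millis) {
        return TimeUnit.MILLISECONDS.toMinutes(millis);
    }

    /** Converts a byte count to whole mebibytes (assumed to be what the log prints as "MB"). */
    static long bytesToMib(long bytes) {
        return bytes / MIB;
    }

    /** Free percentage derived from used and committed sizes given in the same unit. */
    static double freePercent(long used, long committed) {
        return 100.0 * (1.0 - (double) used / committed);
    }

    public static void main(String[] args) {
        long metricsLogFrequency = 600L * 10 * 1000; // config: #{600 * 10 * 1000}
        long maxSize = 120L * GIB;                   // config: #{120L * 1024 * 1024 * 1024}

        // Interval currently configured, and the value that one minute would need.
        System.out.println("metricsLogFrequency = " + millisToMinutes(metricsLogFrequency) + " min");
        System.out.println("one minute          = " + TimeUnit.MINUTES.toMillis(1) + " ms");

        // Region size in the unit used by the log (compare with comm=122880MB).
        System.out.println("Default_Region max  = " + bytesToMib(maxSize) + " MB");

        // Free percentages recomputed from the used/comm values printed in the log.
        System.out.println("Heap free %           = " + freePercent(6605, 20480));
        System.out.println("Default_Region free % = " + freePercent(21878, 122880));
    }
}
```

The recomputed free percentages should land close to the 67.75% and 82.2% printed in the log, which suggests the log's "free" value is simply derived from used and comm.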

Q: What should I do to get more detailed monitoring metrics? Are there any implications if I change metricsLogFrequency to 1 minute?

Should I add the following to the configuration file?

<!-- Enable metrics for this data region  -->
<property name="metricsEnabled" value="true"/>

How can I see additional metrics such as pagesUsed, pagesReplaced, pagesFillFactor, etc.?

Or should I add code in the client application like:

import org.apache.ignite.DataRegionMetrics;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

// Look up the already-started Ignite instance named "GridA" in this JVM.
Ignite ignite = Ignition.ignite("GridA");
// Log allocation rate, fill factor and page replace rate for each data region.
for (DataRegionMetrics m : ignite.dataRegionMetrics()) {
    LOG.info(m.getName() + ": " + m.getAllocationRate() + ":"
            + m.getPagesFillFactor() + ":" + m.getPagesReplaceRate());
}
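
Since the JVM above is started with remote JMX enabled, another option might be to read the same values over JMX without touching the client application. The following is a rough sketch using only the JDK's javax.management API. The class name is hypothetical, and the object-name pattern (domain org.apache, group=DataRegionMetrics) is an assumption about how Ignite registers its data region beans; verify it with a JMX browser such as jconsole before relying on it. The attribute names are derived from the getters used in the snippet above. The values will likely stay at zero unless metricsEnabled is set on the data region.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical reader that prints data region attributes exposed over JMX. */
class DataRegionJmxReader {
    /** Builds the standard RMI-based JMX service URL for a host and port. */
    static String jmxUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** ASSUMPTION: data region beans live in domain "org.apache" with group=DataRegionMetrics. */
    static ObjectName regionPattern() {
        try {
            return new ObjectName("org.apache:group=DataRegionMetrics,*");
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Prints selected attributes for every bean that matches the pattern. */
    static void dump(MBeanServerConnection conn) throws Exception {
        String[] attrs = {"AllocationRate", "PagesFillFactor", "PagesReplaceRate"};
        for (ObjectName bean : conn.queryNames(regionPattern(), null)) {
            System.out.println(bean.getKeyProperty("name"));
            for (String attr : attrs) {
                System.out.println("  " + attr + " = " + conn.getAttribute(bean, attr));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: DataRegionJmxReader <host> <jmx-port>");
            return;
        }
        JMXServiceURL url = new JMXServiceURL(jmxUrl(args[0], Integer.parseInt(args[1])));
        // JMXConnector is Closeable, so the connection is released when the block exits.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            dump(connector.getMBeanServerConnection());
        }
    }
}
```

The host and port are passed as arguments; they would need to match the jmxremote settings the node is actually running with.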

Please help!

Source: https://stackoverflow.com/questions/65125071/apache-ignite-2-7-5-monitoring-metrics
