Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It supports clusters of up to 2000 nodes.
Ganglia can scale to handle clusters with thousands of nodes.
cygwin 1.7 + ganglia 3.5
gmond.c:160: error: parse error before '*' token
gmond.c:160: warning: type defaults to `int' in declaration of `hosts_mutex'
gmond.c:160: warning: data definition has no type or storage class
gmond.c: In function `Ganglia_host_get':
gmond.c:1029: warning: implicit declaration of function `apr_thread_mutex_create'
gmond.c:1029: error: `APR_THREAD_MUTEX_DEFAULT' undeclared (first use in this function)
gmond.c:1029: error: (Each undeclared identifier is reported only once
gmond.c:1029: error: for each function it appears in.)
gmond.c:1055: warning: implicit declaration of function `apr_thread_mutex_lock'
gmond.c:1057: warning: implicit declaration of function `apr_thread_mutex_unlock'
gmond.c: In function `tcp_listener':
gmond.c:3056: warning: implicit declaration of function `apr_thread_exit'
gmond.c: In function `main':
gmond.c:3174: error: `APR_THREAD_MUTEX_DEFAULT' undeclared (first use in this function)
Last edit: char tao 2013-03-08
I am using Ganglia to monitor a large infrastructure with more than 300 nodes, but the central machine that collects data from those nodes via gmetad has a very high CPU load due to heavy I/O. I tried putting the RRD files on a ramdisk; that helped, but the load is still about 9. If anyone has a solution, please help, as this is causing me a lot of worry.
Thanks
rrdcached is a daemon that receives updates to existing RRD files,
accumulates them and, if enough have been received or a defined time has
passed, writes the updates to the RRD file. A flush command may be used to
force writing of values to disk, so that graphing facilities and similar
can work with up-to-date data.
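The accumulate-then-write behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not rrdcached's implementation: the class name `WriteCache`, its parameters, and the `write_fn` callback are all hypothetical, and unlike a real daemon this toy only checks the age of pending updates when a new update arrives.

```python
import time


class WriteCache:
    """Toy model of a write-back cache: batch updates per file, write rarely."""

    def __init__(self, write_fn, max_updates=3, max_age=300.0, clock=time.time):
        self.write_fn = write_fn        # called as write_fn(filename, values)
        self.max_updates = max_updates  # write once this many updates are pending
        self.max_age = max_age          # ...or once the oldest one is this old (s)
        self.clock = clock              # injectable clock, handy for testing
        self.pending = {}               # filename -> (first_seen, [values])

    def update(self, filename, value):
        """Queue one update; write the batch if a threshold is reached."""
        first_seen, values = self.pending.setdefault(filename, (self.clock(), []))
        values.append(value)
        too_many = len(values) >= self.max_updates
        too_old = self.clock() - first_seen >= self.max_age
        if too_many or too_old:
            self.flush(filename)

    def flush(self, filename):
        """Force pending updates for one file to be written now."""
        entry = self.pending.pop(filename, None)
        if entry is not None:
            self.write_fn(filename, entry[1])
```

With `max_updates=3`, three calls to `update` for the same file produce a single call to `write_fn`, which is the reason batching reduces disk I/O. A reader that needs fresh data calls `flush` first, mirroring the flush command mentioned above.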
2013/11/21 Ayman ayman-shorman@users.sf.net
Great software, really couldn't ask more from it.
Hello:
I am using ganglia-3.7.1 on AIX 7.1 on IBM POWER6. When I run configure, I receive these error messages:
Checking for confuse
checking for cfg_parse in -lconfuse... no
Trying harder including gettext
checking for cfg_parse in -lconfuse... no
Trying harder including iconv
checking for cfg_parse in -lconfuse... no
libconfuse not found
But I have installed libconfuse-2.7-1 and libconfuse-devel-2.7-1.
Please help me!
Try configuring with --with-libconfuse=/usr/lib or LDFLAGS="-L /usr/lib"
2016-04-01 10:39 GMT+08:00 jigli jigli@users.sf.net:
I have to monitor file descriptors (fds) for particular processes. I made the changes below to the Ganglia configuration, but I am getting a blank graph. When I run the same script from the console, I get output.
I added this module at /usr/lib64/ganglia/pythonmodules/procfds.py:
import os
OBSOLETE_POPEN = False
try:
    import subprocess
except ImportError:
    import popen2
    OBSOLETE_POPEN = True
import threading
import time

_refresh_rate = 30 # Refresh rate of the netstat data
_conns = {'process_fds': 0}

def TCP_Connections(name):
    global tempconns
    tempconns= []
    pid = file('/var/run/computenode.pid', 'rt').readline().strip()
    process = subprocess.Popen("ls /proc/"+pid+"/fd | wc -l", stdout=subprocess.PIPE,shell=True)
    lines = process.communicate()[0].strip()
    _conns['process_fds']=lines
    ret = int(_conns[name])
    return ret

# Metric descriptions
_descriptors = [{
    'name': 'process_fds',
    'call_back': TCP_Connections,
    'time_max': 20,
    'value_type': 'uint',
    'units': '',
    'slope': 'both',
    'format': '%u',
    'description': 'Total number of file descriptor ',
    'groups': 'procstat'
}]

def metric_init(params):
    '''Initialize the tcp connection status module and create the
    metric definition dictionary object for each metric.'''
    global _refresh_rate

def metric_cleanup():
    '''Clean up the metric module.'''
    pass

if __name__ == '__main__':
    params = {'Refresh': '20'}
    metric_init(params)
    while True:
        try:
            for d in _descriptors:
                v = d['call_back'](d['name'])
                print 'value for %s is %u' % (d['name'], v)
            time.sleep(5)
        except KeyboardInterrupt:
            os._exit(1)
Configuration file:
[root@mtl-nes-qa3-cn1 conf.d]# cat proc_fds.pyconf
modules {
  module {
    name = 'proc_fds'
    language = 'python'
  }
}
collection_group {
  collect_every = 30
  time_threshold = 30
  metric {
    name = "process_fds"
    value_threshold = "256.0"
    title = "process_fds"
  }
}
Please let me know why the graph is blank, and also why value_threshold does not change: it stays in the 0.0/0.5 range.
Hello,
I'm using ganglia-3.7.2, and a security scan reported the following issue:
Description:
Unix-based systems support variable settings to control access to files. World writable files are the least secure. See the chmod(2) man page for more information.
Rationale:
Data in world-writable files can be modified and compromised by any user on the system. World writable files may also indicate an incorrectly written script or program that could potentially be the cause of a larger compromise to the system's integrity.
Remediation:
Removing write access for the "other" category ( chmod o-w <filename> ) is advisable, but always consult relevant vendor documentation to avoid breaking any application dependencies on a given file.
Assessment:
Ensure no world writable files exist
Script: sce/world_writable_files.sh
Standard Output:
World-Writable file /var/log/sdc_alarm.log
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/host_extra.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/cluster_host_metric_graphs.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/host_view.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/cluster_overview.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/footer.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/metric_group_view.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/host_overview.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/cluster_extra.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/cluster_view.tpl.d17.php
World-Writable file /var/lib/ganglia/dwoo/compiled/templates/default/header.tpl.d17.php
Standard Error:
No Standard Error was produced
I found that these files are generated automatically when a user accesses the gweb GUI. Is there a resolution, or any other advice, for this security scan finding? Please help. Thanks.
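As a stopgap, the `chmod o-w` remediation quoted in the report can be scripted. The sketch below is only an illustration: the function names `find_world_writable` and `remove_other_write` are hypothetical, and it does not address whether gweb will regenerate the compiled templates with the same permissions on the next page view, which would need to be checked separately.

```python
import os
import stat


def find_world_writable(root):
    """Yield regular files under root whose mode has the other-write bit set."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_IWOTH:
                yield path


def remove_other_write(path):
    """Equivalent of `chmod o-w path`: clear only the other-write bit."""
    mode = stat.S_IMODE(os.lstat(path).st_mode)
    os.chmod(path, mode & ~stat.S_IWOTH)
```

For example, calling `remove_other_write(p)` for each `p` in `list(find_world_writable('/var/lib/ganglia/dwoo/compiled'))` would cover the template files listed in the scan output. Listing the paths first and reviewing them before changing anything is the safer order.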
Dear all,
Do you have metrics available in Ganglia for Spark memory variables such as
Input, Storage Memory, Shuffle Read and Shuffle Write,
for the active tasks of the driver and executors for each application_xxxxxx,
and for the cluster as a whole? I attach the memory settings we currently get, in a static manner only, from the Spark Job History, as an example of what we want to have in Ganglia:
Please help. Thanks.