bigsister-general Mailing List for Big Sister
|
From: Thomas A. <ae...@gr...> - 2016-04-20 17:10:50
|
On 20.04.2016 18:42, Pablo Salvia wrote:
> directory "graphs" has these permissions:
>
> drwxrwx--- 2 root bigsis 81920 Apr 18 13:14 graphs
That might be the issue ... could you please try changing the permissions
with
chmod o+rx .
(within graphs directory)?
Kind regards,
Tom
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
|
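The mode in the quoted listing (drwxrwx---) grants "others" no access at all, which is what `chmod o+rx` is meant to change. A minimal sketch of that check, using only Python's standard `stat` module; the helper name `others_can_list` is made up for illustration and is not part of BigSister:

```python
import stat

def others_can_list(mode: int) -> bool:
    """Return True if 'others' may both read and traverse a directory with this mode."""
    wanted = stat.S_IROTH | stat.S_IXOTH  # the o+r and o+x permission bits
    return (mode & wanted) == wanted

# Mode from the listing (drwxrwx---), before and after the effect of `chmod o+rx`
before = 0o770
after = before | stat.S_IROTH | stat.S_IXOTH

print(stat.filemode(stat.S_IFDIR | before), others_can_list(before))
print(stat.filemode(stat.S_IFDIR | after), others_can_list(after))
```

On a real system the mode would come from `os.stat(path).st_mode`, and the same test would have to pass for the files inside the directory (read bit only).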
|
From: Thomas A. <ae...@gr...> - 2016-04-20 15:53:18
|
Hi Pablo,
On 20.04.2016 16:32, Pablo Salvia wrote:
> rrdtool seems to be in
> the path variable (although i found that variable in several files,
> and i dont know if i edited the correct one). /var/graphs path has
> files, .rrd seem to be updated regularly
this implies that BigSister was able to find rrdtool, to setup the
RRD databases and is pushing data in. So, the good news is that the rest
must be some rather small issue.
> graph files have dates that go back 2 years ago
> when bigsister was installed (they seem to be some sort of template
> for the .rrd files anyway, i´m i wrong?).
You are mostly right. They contain some meta information.
> Graph for host sbackup2012, ID sbackup2012.0.cpuload% unavailable at
> /usr/share/bigsister/cgi/bshistgraph line 363.
Now this is strange. Could you please verify:
- that a file called "index" exists in the same directory as all the
.rrd files and you actually find an entry sbackup2012.0.cpuload%
therein (rather, you should find 3 of them)
- try
grep "sbackup2012.0.cpuload%" *.graph
- you should get a few listings (probably 3)
- that the directory and files within are public readable (bshistgraph
will run under a different user account than the other BigSister processes
and still needs to be able to read the data)
Kind regards,
Tom
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
|
|
From: Pablo S. <lal...@gm...> - 2016-04-20 14:32:53
|
Hi,
I know this issue is pretty common, but none of the solutions that seemed to work for others work in my case, probably because I'm not very familiar with Linux or BigSister.
rrdtool seems to be in the path variable (although I found that variable in several files, and I don't know if I edited the correct one). The /var/graphs path has files, and the .rrd files seem to be updated regularly (at least the ones reporting stuff from the hosts). The graph files have dates that go back 2 years, to when BigSister was installed (they seem to be some sort of template for the .rrd files anyway, am I wrong?).
The errors I get are the following:
Graph for host sbackup2012, ID sbackup2012.0.cpuload% unavailable
I get this for every host and all of the reports, and in all of the ranges (6h, 2d, etc.). If I select any of the hosts in the checkbox and then click on the "right pointing arrow" (yeah, that's not the name for it, I know), I get this error:
Graph for host sbackup2012, ID sbackup2012.0.cpuload% unavailable at /usr/share/bigsister/cgi/bshistgraph line 363.
If there's any more information I can provide or any test I should perform, please tell me how. I have been trying to solve this issue for a while with no success, so any help will be appreciated.
Thanks!!!
|
|
From: Kai S. <mai...@co...> - 2015-12-02 17:31:12
|
Thomas Aeby wrote on Wed, 2 Dec 2015 17:10:10 +0100:
> That's because it's one of the "old style" tests. And there is no
> manpage either :-(
Yeah, I couldn't find any documentation on syntax. Only the one I used in my config file and a few mentions on the mailing list, but no example. So I dug for the code.
> Indeed, that's very limitted and does result in a problem if you are
> monitoring a DNS server which does not resolve uxmon's host.
Ah, well, thinking about this a second time I now understand how the config was supposed to be done. I was using it the wrong way! I used the IP number for the nameserver. This way it did a reverse dns lookup against itself, which didn't work all the time, anyway. But the remote dns gave a NOERROR back, so it succeeded for bs. Now the ns vendor made a new release and changed this to error code REFUSED - which made bs go red and me have a look.
So, using something like
ns1.example.org(DNS) host=ns1.example.org dns
should have worked. But I used the IP number. I know this or a similar config line used to work years ago, but then I had a problem, asked on the list and came up with the query= syntax which worked (but not in the intended way, which I didn't notice).
Here's what you should use for syntax with the corrected code you checked in:
hostname or IP number(bs hostname) query="host or zone to be checked" host="nameserver hostname or IP number" item="column in monitoring display" dns
> Other nice things to do in a DNS check: check zone consistency via SOA queries.
You mean, check master and slave zones for same version? Yeah, if you check a lot of domains, this might be helpful. I just want to check if the nameserver is up and responds.
Cheers to you both!
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
From: Thomas A. <ae...@gr...> - 2015-12-02 16:25:49
|
Hello Kai,
On 02.12.2015 15:00, Kai Schaetzl wrote:
> Let's see if this mailing list still exists. :-)
At least two ... no, three ... members are still here :-)
> I've found a bug today in the dns test. I don't know why but this test is
> not aware to "testers" comamnd,
That's because it's one of the "old style" tests. And there is no
manpage either :-(
> but it works, sort of.
> But the check added to bs is incorrect and queries "just itself"
Indeed, that's very limited and does result in a problem if you are
monitoring a DNS server which does not resolve uxmon's host.
> wrong code (at the end of the file):
> $cmd->add_check( $args{"alias"}.".dns", "$dnscomm $args{host} $args
> {host}",
> [ 0, "green", "dns OK" ], [ "timeout", "red", "dns
> TIMEOUT" ], [ "*", "red", "dns FAILURE" ] );
>
> correct code:
> $cmd->add_check( $args{"alias"}.".".$args{"item"}, "$dnscomm $args{query}
> $args{host}",
> [ 0, "green", "dns OK" ], [ "timeout", "red", "dns
> TIMEOUT" ], [ "*", "red", "dns FAILURE" ] );
>
> e.g. change the first $args{host} to $args{query}
> and the $args{"alias"}.".dns" to $args{"alias"}.".".$args{"item"}
... and for backwards compatibility I'll add defaults for item/query
("dns" and hostname) and check the changes in.
Other nice things to do in a DNS check: check zone consistency via SOA queries.
But then ...
Thanks a lot for your contribution.
Kind regards,
Tom
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
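The backwards-compatible defaults mentioned in this message (item falling back to "dns", query falling back to the host) can be sketched as follows. This is an illustration in Python, not the Perl that was checked in; `build_dns_check` and the `dnscheck` command string are hypothetical stand-ins for the `add_check` call and `$dnscomm`:

```python
def build_dns_check(args: dict, dnscomm: str = "dnscheck") -> tuple:
    """Compose (check name, command line) with defaults for 'item' and 'query'."""
    host = args["host"]
    item = args.get("item", "dns")    # default keeps the old ".dns" column name
    query = args.get("query", host)   # default keeps the old "query itself" behaviour
    name = f'{args["alias"]}.{item}'
    command = f"{dnscomm} {query} {host}"
    return name, command

# Old-style configuration: no item/query given
print(build_dns_check({"alias": "ns1", "host": "ns1.example.org"}))
# New-style configuration: explicit query and display column
print(build_dns_check({"alias": "ns1", "host": "ns1.example.org",
                       "query": "example.org", "item": "zone"}))
```

With the defaults in place, an existing configuration line without query= or item= keeps producing the same check name and command as before the fix.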
|
|
From: Niels B. <nb...@us...> - 2015-12-02 16:02:04
|
On 02-12-2015 at 15:00, Kai Schaetzl wrote:
> Let's see if this mailing list still exists. :-)
You are not all alone, but I have been wondering the same :-)
/Niels
--
Niels Baggesen - @home - Århus - Denmark - nb...@us...
The purpose of computing is insight, not numbers --- R W Hamming
|
|
From: Kai S. <mai...@co...> - 2015-12-02 14:27:20
|
Let's see if this mailing list still exists. :-)
Hello everyone.
I've found a bug today in the dns test. I don't know why, but this test is
not aware of the "testers" command; it works, sort of.
But the check added to bs is incorrect and queries "just itself", which led
to an error here today.
file /usr/share/bigsister/uxmon/Config/dns
wrong code (at the end of the file):
$cmd->add_check( $args{"alias"}.".dns", "$dnscomm $args{host} $args{host}",
    [ 0, "green", "dns OK" ], [ "timeout", "red", "dns TIMEOUT" ], [ "*", "red", "dns FAILURE" ] );
correct code:
$cmd->add_check( $args{"alias"}.".".$args{"item"}, "$dnscomm $args{query} $args{host}",
    [ 0, "green", "dns OK" ], [ "timeout", "red", "dns TIMEOUT" ], [ "*", "red", "dns FAILURE" ] );
i.e. change the first $args{host} to $args{query}
and the $args{"alias"}.".dns" to $args{"alias"}.".".$args{"item"}
or you won't be able to redirect the test display to the column you want
it to appear in.
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
From: Kai S. <mai...@co...> - 2014-06-03 12:31:11
|
Yes, adding the line for messages (just for that file is enough) makes it stop that.
And just to finish the mystery I looked at the other vm and found that I had added a node line 5 years ago to it ...
It goes like
"node reportedname=wantedname"
Thanks!
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
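The host name translation discussed in this thread can be pictured as a simple lookup table. A minimal sketch, assuming the `node reportedname=wantedname` form quoted in this message; the function names are hypothetical and this is not BigSister's actual implementation:

```python
def parse_node_lines(config_text: str) -> dict:
    """Collect 'node reported=wanted' lines into a {reported: wanted} mapping."""
    mapping = {}
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "node" and "=" in parts[1]:
            reported, wanted = parts[1].split("=", 1)
            mapping[reported] = wanted
    return mapping

def translate(host: str, mapping: dict) -> str:
    """Return the wanted name for a host seen in the log, or the host unchanged."""
    return mapping.get(host, host)

mapping = parse_node_lines("node virtual=virtual2\n")
print(translate("virtual", mapping))   # name as it appears in the log, renamed
print(translate("mailhost", mapping))  # hosts without a node line pass through
```

The point of the table is that the syslog monitor takes the host name from the log line itself, so only names that actually appear in the log need an entry.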
|
From: Kai S. <mai...@co...> - 2014-06-02 16:31:12
|
Thomas Aeby wrote on Mon, 02 Jun 2014 17:05:02 +0200:
> This was only a quick guess ... it's one of the traps of renaming hosts
> in BigSister only, and the fact that msgs and disk checks are affected
> leads into this direction (default configuration maps I/O errors appearing
> in syslog to disk test, so, yes, syslog monitor affects disk test, too).
I removed both tests and readded one by one. It's indeed the msgs test that renames the disk test. A single disk test is fine.
> > I can't find any mention of "node" in man syslog.conf.
>
> Maybe I have not been clear enough. I am talking about the syslog monitor
> of BigSister and its configuration in ...etc/syslog (/usr/share/bigsister/etc/syslog
> if installed via package manager).
Ah, oh, there. I thought you meant renaming the log line in messages or wherever it may appear. I thought that file contains only regexp for conditions. Yes, I'll try that, thanks!
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
From: Thomas A. <ae...@gr...> - 2014-06-02 15:05:25
|
Hello Kai,
On 06/02/2014 04:08 PM, Kai Schaetzl wrote:
> Kai Schaetzl wrote on Sat, 31 May 2014 18:46:25 +0200:
>
>> thanks for the suggestion, but I don't think it's what happens here
>> (although, I'm gonna try, anyway).
This was only a quick guess ... it's one of the traps of renaming hosts
in BigSister only, and the fact that msgs and disk checks are affected
leads into this direction (default configuration maps I/O errors appearing
in syslog to disk test, so, yes, syslog monitor affects disk test, too).
> I can't find any mention of "node" in man syslog.conf.
Maybe I have not been clear enough. I am talking about the syslog monitor
of BigSister and its configuration in ...etc/syslog (/usr/share/bigsister/etc/syslog
if installed via package manager).
> Quite mysterious. I could understand that some of the tests pick up the
> wrong name from somewhere and I might start digging in the source then.
A good start is /usr/share/bigsister/uxmon/Config/syslog then - still thinking
that you ran into the usual syslog monitor troubles.
Kind regards,
Tom
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
|
|
From: Kai S. <mai...@co...> - 2014-06-02 14:08:23
|
Kai Schaetzl wrote on Sat, 31 May 2014 18:46:25 +0200:
> thanks for the suggestion, but I don't think it's what happens here
> (although, I'm gonna try, anyway).
I can't find any mention of "node" in man syslog.conf. I tried both versions and it doesn't change anything in the logging name. I presume if it exists for one logging software (I'm using plain old syslog) it may be rather for renaming remote host entries. Or did you mean to talk about /etc/sysconfig/syslog?
I've also tried changing the .bashrc for root as I did on the other. No change.
I see in syslog that it starts monitoring virtual2.disk, so it's not even reporting it's also doing virtual.disk checks.
Quite mysterious. I could understand that some of the tests pick up the wrong name from somewhere and I might start digging in the source then. But doing additional tests and only doing that on one vm, although the other is configured almost the same, is quite mysterious and should be fixable without touching the code.
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
From: <ja...@mo...> - 2014-06-01 13:14:20
|
oops, stupid amavis was blocking the message.. it seems to work fine now.. thanks! Jason |
|
From: <ja...@mo...> - 2014-05-31 22:39:26
|
so I'm trying that, and I see that bigsister is trying to send an alert
tail -f messages | grep "sending page"
May 31 18:12:59 monsterjam bsmon: sending page for tf1.msgs with severity 50 via /usr/lib/sendmail to jason: warning, syslog looks fine
May 31 18:12:59 monsterjam bsmon: sending page for tf1.disk with severity 50 via /usr/lib/sendmail to jason: no errors logged|>&green /home 2.5GB (41%) free, &green /boo
May 31 18:17:57 monsterjam bsmon: sending page for tf1.disk with severity 50 via /usr/lib/sendmail to jason: |>&yellow /var 43.3MB (8.8%) free|>no errors logged, &green
but it never seems to actually send the mail..
[jason@monsterjam ~]$ ls -al /usr/lib/sendmail
lrwxrwxrwx 1 root root 30 Mar 6 14:11 /usr/lib/sendmail -> /etc/alternatives/mta-sendmail
[jason@monsterjam ~]$ ls -al /etc/alternatives/mta-sendmail
lrwxrwxrwx 1 root root 25 Mar 6 14:11 /etc/alternatives/mta-sendmail -> /usr/lib/sendmail.postfix
[jason@monsterjam ~]$ ls -al /usr/lib/sendmail.postfix
lrwxrwxrwx 1 root root 24 Mar 6 14:11 /usr/lib/sendmail.postfix -> ../sbin/sendmail.postfix
[jason@monsterjam ~]$ ls -al /usr/sbin/sendmail.postfix
-rwxr-xr-x 1 root root 207696 Feb 20 05:06 /usr/sbin/sendmail.postfix
[jason@monsterjam ~]$
the sendmail seems ok.. and other emails get delivered..
regards,
Jason
On Sat, May 31, 2014 at 05:06:07PM +0200, Thomas Aeby wrote:
> On 05/31/2014 01:11 PM, Jason Welsh wrote:
> >
> > wow thats quite a detailed explanation.. thanks.. so that leads me to
> > my next question. as a test, I set my filsystem to be almost full by
> > creating a null file.. The disk went "yellow" but I never got an
> > email notification.. Ive been looking at the docs and I cant figure
> > out how either. 1. make Bigsis email an alert on that yellow,
> > (probably not what I want)
>
> In order to get an alarm already when the disk test goes yellow
> you would add something like
>
> *.disk down=yellow
>
> to /etc/bigsister/bb_event_generator.cfg
>
> > or 2. scale back the yellow and red
> > thresholds for the disk full notifications..
>
> Indeed, by default the "fail" (red) level is 5%, which might never
> actually be reached on your system.
>
> You can change the warn (yellow) and (red) level of the tests by
> altering the "disk" test in your uxmon-net or uxmon-asroot file
> like
>
> localhost warn=15% fail=10% disk
>
> Kind regards,
> Tom
> ------------------------------------------------------------------------
> Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
> Voice : (+41)26 4180040
> Internet: ae...@gr... PGP public key available
> ------------------------------------------------------------------------
>
--
================================================
| Jason Welsh ja...@mo... |
| http://monsterjam.org DSS PGP: 0x5E30CC98 |
| gpg key: http://monsterjam.org/gpg/ |
================================================
|
|
From: Kai S. <mai...@co...> - 2014-05-31 16:46:29
|
Hi Thomas,
thanks for the suggestion, but I don't think it's what happens here (although, I'm gonna try, anyway). Two reasons:
1. both hosts have the same "log node", which is indeed "virtual", so both should be reporting "wrong", but only one does. And it's *additional*.
2. it also occurs with the disk test which surely doesn't use syslog, does it?
I'm now wondering if it may have something to do with the shell that bs runs under for these tests or as a daemon? I changed .bashrc or bash_login on the virtual2 machine to report "virtual2-virtual:" as the command prompt while the other just reports "virtual:". So that I know where I am when I'm logged in. I thought I changed that only for root logins, but I may have changed it on a global base.
I also have a few uxmon-asroot tests running on the virtual2 machine that are not running on the machine reporting duplicates. Maybe it's picking up the "correct" name that way. I'm gonna check how the shell looks for both uxmon daemons.
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
From: Thomas A. <ae...@gr...> - 2014-05-31 15:06:24
|
On 05/31/2014 01:11 PM, Jason Welsh wrote:
>
> wow thats quite a detailed explanation.. thanks.. so that leads me to
> my next question. as a test, I set my filsystem to be almost full by
> creating a null file.. The disk went "yellow" but I never got an
> email notification.. Ive been looking at the docs and I cant figure
> out how either. 1. make Bigsis email an alert on that yellow,
> (probably not what I want)
In order to get an alarm already when the disk test goes yellow
you would add something like
*.disk down=yellow
to /etc/bigsister/bb_event_generator.cfg
> or 2. scale back the yellow and red
> thresholds for the disk full notifications..
Indeed, by default the "fail" (red) level is 5%, which might never
actually be reached on your system.
You can change the warn (yellow) and (red) level of the tests by
altering the "disk" test in your uxmon-net or uxmon-asroot file
like
localhost warn=15% fail=10% disk
Kind regards,
Tom
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
|
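How warn and fail levels translate into a status colour can be sketched as below. This is a rough illustration only: it assumes the levels are compared against the percentage of free space (the "% free" figures in the status lines suggest this, but the thread does not state it), and whether the comparison is strict at the boundary is a guess. The function name `disk_status` is hypothetical:

```python
def disk_status(free_percent: float, warn: float = 15.0, fail: float = 10.0) -> str:
    """Map a free-space percentage to a colour, using the warn=15% fail=10% example."""
    if free_percent < fail:    # assumption: below the fail level means red
        return "red"
    if free_percent < warn:    # assumption: below the warn level means yellow
        return "yellow"
    return "green"

for free in (41.0, 12.0, 8.8):
    print(free, disk_status(free))
```

Under these assumptions, a file system with 8.8% free would already be red with the example levels, so an alarm would fire without needing a `down=yellow` rule.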
|
From: Thomas A. <ae...@gr...> - 2014-05-31 14:56:24
|
Hello Kai,
On 05/31/2014 02:08 PM, Kai Schaetzl wrote:
> Oh, there's life on the list, so I'm gonna ask something. :-)
:-) Yep, it's been a while ...
> However, one of them insists on doing msgs and disk tests *additionally*
> as "virtual" and I can't find out why it's doing that.
This is an annoying (somehow justified, anyway) trap within the log tests:
They actually extract the name of the host(s) they should report for from
the log itself. Having a look at a syslog entry, you will see that your syslog
actually logs under the name "virtual" on the virtual2 machine and uxmon
uses this name. This comes in handy when setting up a central syslog hub
(one uxmon instance can then monitor the logs of a number of machines and
report events for the respective machine), but is a nuisance when trying to
virtually rename hosts only within BigSister.
However, there is a way out: add a line like
node virtual2=virtual
(or was it "node virtual=virtual2" - I never remember well) in your
...etc/syslog configuration. You can set up a whole list of host name
translations like that - which are only valid for the syslog monitor, however.
Kind regards,
Tom
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
|
|
From: Kai S. <mai...@co...> - 2014-05-31 12:35:55
|
Oh, there's life on the list, so I'm gonna ask something. :-)
I have a host that exists in two virtual machine variants. "Internally" it's got the same hostname, because I have an old commercial software running on it that is bound to the hostname coming up when calling hostname. The second one is not actually in production, but a backup of the first, just "in case". Both systems have 1.02-4 rpms installed and are on CentOS 5-latest.
hostname command gives back the same hostname on both virtual machines, say "virtual.example.com". So, I'm overriding both of them in uxmon-net to get distinctive hostnames. However, one of them insists on doing msgs and disk tests *additionally* as "virtual" and I can't find out why it's doing that.
host1 (external2.example.com, hostname: virtual.example.com):
DESCR features=unix,linux,local localhost
localhost(external2) memory cpuload disk syslog network
reports as external2 for all tests
host2 (virtual2.example.com, hostname: virtual.example.com):
DESCR features=unix,linux,local localhost
localhost(virtual2) memory cpuload disk syslog network
reports as virtual2 for all tests and as virtual for msgs and disk (which produces an extra line virtual that I do not want) *in addition*, e.g. it reports twice (mostly at the same time as the other tests, but sometimes not) for these two tests.
There is no extra uxmon running on virtual2 and surely no second "disk" or "syslog" test. Admin, Agents, show agents by monitor shows that it is host virtual2.example.com reporting both virtual and virtual2.
What's causing these additional tests? I haven't seen any other host doing that ever. They always stick to the (your reported name here) directive.
Kai
--
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
From: Jason W. <ja...@mo...> - 2014-05-31 11:25:16
|
wow, that's quite a detailed explanation.. thanks..
so that leads me to my next question. As a test, I set my filesystem to be almost full by creating a null file.. The disk went "yellow" but I never got an email notification.. I've been looking at the docs and I can't figure out how to either
1. make Bigsis email an alert on that yellow (probably not what I want), or
2. scale back the yellow and red thresholds for the disk full notifications..
regards,
Jason
On 05/30/2014 05:35 PM, Thomas Aeby wrote:
> Hello Jason,
>
> On 05/30/2014 08:45 PM, Jason Welsh wrote:
>> so on my server that Im monitoring (running uxmon), I see the following
>>
>> [root@tf1 ~]# df -h | grep sda9
>> /dev/sda9 494M 445M 24M 95% /var
>>
>>
>> on the big sister webpage on the server, the "Disk" status for this host is yellow, but in the disk report, it shows
>>
>> /dev/sda9 493.8MB 444.6MB 49.1MB 90.1% /var
>>
>> which is off by quite a bit..
>
> This is because df and BigSister use different notions of "free". As you
> can see, "free" in BigSister terms is total size - used size, and the used
> percentage is computed as used size / total size. df displays
> total size - reserved space - used size = free size, and it computes used
> percentage as used size / (total size - reserved size). Thus, the different
> figures for the free size while total and used are identical. I assume, on
> your /var file system 5% are reserved for the superuser. The used percentage
> will always differ by 5% (yes, that means, that an empty file system will
> be shown as 5% used by df, and a really full filesystem as 105% used, while
> BigSister's figures will be 0% and 100% in these two cases).
>
> Neither of the two approaches is wrong ...
>
> You'll have to take the meaning of the figures into account when adjusting
> warn and fail levels.
>
> Kind regards,
> Tom
>
|
|
From: Thomas A. <ae...@gr...> - 2014-05-30 21:57:02
|
Hello Jason,
On 05/30/2014 08:45 PM, Jason Welsh wrote:
> so on my server that Im monitoring (running uxmon), I see the following
>
> [root@tf1 ~]# df -h | grep sda9
> /dev/sda9 494M 445M 24M 95% /var
>
>
> on the big sister webpage on the server, the "Disk" status for this host is yellow, but in the disk report, it shows
>
> /dev/sda9 493.8MB 444.6MB 49.1MB 90.1% /var
>
> which is off by quite a bit..
This is because df and BigSister use different notions of "free". As you
can see, "free" in BigSister terms is total size - used size, and the used
percentage is computed as used size / total size. df displays
total size - reserved space - used size = free size, and it computes used
percentage as used size / (total size - reserved size). Thus, the different
figures for the free size while total and used are identical. I assume, on
your /var file system 5% are reserved for the superuser. The used percentage
will always differ by 5% (yes, that means, that an empty file system will
be shown as 5% used by df, and a really full filesystem as 105% used, while
BigSister's figures will be 0% and 100% in these two cases).
Neither of the two approaches is wrong ...
You'll have to take the meaning of the figures into account when adjusting
warn and fail levels.
Kind regards,
Tom
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
|
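The two formulas described in this message can be worked through with the figures from the thread. A minimal sketch, assuming the 5% superuser reservation that the message itself only assumes; the function names are made up for illustration, and the inputs are the rounded values from the report, so the results only approximate what the tools display:

```python
def bigsister_view(total: float, used: float) -> tuple:
    """Free space and used percentage as described for BigSister."""
    free = total - used
    return free, 100.0 * used / total

def df_view(total: float, used: float, reserved_fraction: float = 0.05) -> tuple:
    """Free space and used percentage as described for df (reservation is an assumption)."""
    reserved = total * reserved_fraction
    free = total - reserved - used
    return free, 100.0 * used / (total - reserved)

total_mb, used_mb = 493.8, 444.6   # /var figures from the disk report
bs_free, bs_pct = bigsister_view(total_mb, used_mb)
df_free, df_pct = df_view(total_mb, used_mb)

print(f"BigSister: {bs_free:.1f} MB free, {bs_pct:.1f}% used")
print(f"df:        {df_free:.1f} MB free, {df_pct:.1f}% used")
```

The reserved space accounts for the gap between roughly 49 MB and roughly 24 MB free, while the total and used figures are the same in both views.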
|
From: Jason W. <ja...@mo...> - 2014-05-30 18:59:00
|
so on my server that I'm monitoring (running uxmon), I see the following
[root@tf1 ~]# df -h | grep sda9
/dev/sda9 494M 445M 24M 95% /var
on the big sister webpage on the server, the "Disk" status for this host is yellow, but in the disk report, it shows
/dev/sda9 493.8MB 444.6MB 49.1MB 90.1% /var
which is off by quite a bit..
at the top of the webpage, it says
Last change: Fri May 30 14:23:43 2014
so it seems to be getting updates...
any ideas on what to check?
Jason
|
|
From: Niels B. <nb...@us...> - 2012-01-20 21:10:54
|
On 13-01-2012 00:08, Niels Baggesen wrote:
> and it actually looks like its behaving, except that the get statement
> in the third discover blocks seems to be skipped for all but the first
> iteration.
Any ideas, Thomas?
/Niels
--
Niels Baggesen - @home - Århus - Denmark - nb...@us...
The purpose of computing is insight, not numbers --- R W Hamming
|
|
From: Niels B. <nb...@us...> - 2012-01-12 23:09:00
|
On 11-01-2012 18:23, Thomas Aeby wrote:
> I wonder, if this does anything near what I expect it to do (build a two level
> list of indexes in allindexes).
It's close, but there are oddities ...
I have the following code now:
> pernode discover {
> debug 3 "discover pdp modules";
> get @${domain}.isxModularDistModuleInfoAlarmStatus;
> pernode set mod.status ${${domain}.isxModularDistModuleInfoAlarmStatus};
> }
>
> pernode discover {
> set oldmods ${thismods};
> set monmods remove( ${monmods}, ${oldmods} );
> set index ${mod.status};
> select thismods index ${mod.status[${i}]} == 1;
> pernode set monmods ${monmods} ${thismods};
> set monlines;
> }
>
> pernode discover for monmods {
> debug 3 "discover mods ${p}";
> get @${domain}.isxModularDistModuleOutputName.${p};
> pernode set line.name ${${domain}.isxModularDistModuleOutputName.${p}};
> select linindex line.name true;
> for lines linindex ${p}.${i};
> pernode set monlines ${monlines} ${lines};
> debug 3 "done ${p}";
> }
and it actually looks like it's behaving, except that the get statement
in the third discover block seems to be skipped for all but the first
iteration.
uxmon -D 99 gives me this:
> debug apc_pdp_mib: discover pdp modules
> CWorker doing TestWorker=HASH(0x8cc3890) since it is not I/O dependent
> setting snmp.isxModularDistModuleInfoAlarmStatus to 11=1 21=4 7=1 17=1 2=5 22=4 1=1 18=5 23=4 16=5 13=1 6=5 3=1 9=1 12=5 20=4 14=5 15=1 8=5 4=5 24=4 19=4 10=5 5=1
> CWorker doing TestWorker=HASH(0x8cc3668) since it is not I/O dependent
> CWorker doing TestWorker=HASH(0x8b330dc) since it is not I/O dependent
> setting mod.status to 11=1 21=4 7=1 2=5 17=1 22=4 1=1 18=5 13=1 16=5 23=4 6=5 3=1 9=1 12=5 20=4 14=5 15=1 8=5 4=5 24=4 10=5 19=4 5=1
> setting oldmods to
> setting monmods to
> setting index to 11=1 21=4 7=1 17=1 2=5 22=4 1=1 18=5 23=4 16=5 13=1 6=5 3=1 9=1 12=5 20=4 14=5 15=1 8=5 4=5 24=4 19=4 10=5 5=1
> setting thismods to 11=11 3=3 7=7 9=9 17=17 15=15 1=1 13=13 5=5
> setting monmods to 11=11 3=3 7=7 9=9 17=17 15=15 1=1 13=13 5=5
> setting monlines to
> debug apc_pdp_mib: discover mods 11
> CWorker doing TestWorker=HASH(0x8cc4c7c) since it is not I/O dependent
> setting snmp.isxModularDistModuleOutputName.11 to 1=B 6.1 R16 3=B 6.3 Free 2=B 6.2 Free
> CWorker doing TestWorker=HASH(0x8cc5018) since it is not I/O dependent
> CWorker doing TestWorker=HASH(0x8b330dc) since it is not I/O dependent
> setting line.name to 1=B 6.1 R16 3=B 6.3 Free 2=B 6.2 Free
> setting linindex to 1=1 3=3 2=2
> setting lines to 1=11.1 3=11.3 2=11.2
> setting monlines to 1=11.1 3=11.3 2=11.2
> debug apc_pdp_mib: done 11
> debug apc_pdp_mib: discover mods 3
> setting line.name to
> setting linindex to
> setting lines to
> setting monlines to 1=11.1 3=11.3 2=11.2
> debug apc_pdp_mib: done 3
> debug apc_pdp_mib: discover mods 7
> setting line.name to
> setting linindex to
> setting lines to
> setting monlines to 1=11.1 3=11.3 2=11.2
> debug apc_pdp_mib: done 7
> debug apc_pdp_mib: discover mods 9
> setting line.name to
> setting linindex to
> setting lines to
> setting monlines to 1=11.1 3=11.3 2=11.2
> debug apc_pdp_mib: done 9
Any ideas why this is so?
/Niels
--
Niels Baggesen - @home - Århus - Denmark - nb...@us...
The purpose of computing is insight, not numbers --- R W Hamming
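What the `select thismods index ... == 1` step is expected to yield can be cross-checked against the debug output. A minimal sketch in Python that parses the `index=value` pairs logged for mod.status and keeps the indexes whose status is 1; the function names are hypothetical, and the parser only handles values without spaces (so it would not work for the line.name output):

```python
def parse_indexed(text: str) -> dict:
    """Parse 'index=value' pairs separated by whitespace into {index: value}."""
    return {int(k): int(v) for k, v in (pair.split("=") for pair in text.split())}

def select_equal(values: dict, wanted: int) -> list:
    """Return the sorted indexes whose value equals 'wanted'."""
    return sorted(i for i, v in values.items() if v == wanted)

mod_status = ("11=1 21=4 7=1 17=1 2=5 22=4 1=1 18=5 23=4 16=5 13=1 6=5 "
              "3=1 9=1 12=5 20=4 14=5 15=1 8=5 4=5 24=4 19=4 10=5 5=1")

print(select_equal(parse_indexed(mod_status), 1))
```

The selected set matches the nine modules listed for thismods in the log, which supports the observation that the selection works and only the per-module get is being skipped.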
|
|
From: Niels B. <nb...@us...> - 2012-01-12 07:58:23
|
On Wed, Jan 11, 2012 at 06:23:12PM +0100, Thomas Aeby wrote:
> After a long day of work, I'd see some nightmare approach like
:-) I had a feeling that I would have to end up doing something like that
> I wonder, if this does anything near what I expect it to do (build a
> two level list of indexes in allindexes).
I will keep you posted on how this works out!
> If everything else fails, it is always possible to add some perl code
> to the module which you call via call() from tests.cfg. Use the "procs"
> module (sub countprocs) as an example.
Yeah, but I really prefer being able to handle it all using the monitoring language.
/Niels
--
Niels Baggesen - @home - Århus - Denmark - nb...@us...
The purpose of computing is insight, not numbers --- R W Hamming
|
|
From: Thomas A. <ae...@gr...> - 2012-01-11 17:38:54
|
Well, it seems that double indexed structures are not really something
the monitoring language was expected to handle: "select" is
missing an option to make it traverse more than one level.
After a long day of work, I'd see some nightmare approach like
discover {
...
select firstlevel apc.whatever.base;
...
set allindexes;
}
discover for firstlevel {
set index ${apc.whatever.base.${p}};
select secondlevel index;
set doubleindexed;
for doubleindexed secondlevel
${p}.${i};
set allindexes ${allindexes} ${doubleindexed};
}
...
I wonder, if this does anything near what I expect it to do (build a two level
list of indexes in allindexes).
If everything else fails, it is always possible to add some perl code
to the module which you call via call() from tests.cfg. Use the "procs"
module (sub countprocs) as an example.
Kind regards,
Tom
On 01/11/2012 03:39 PM, Niels Baggesen wrote:
> I am right now going to write a monitor for APC power distribution
> cabinets. The have meters in an SNMP table that is double indexed by
> module and line.
>
> Does anybody have an idea of how to handle that? The monitoring
> language is good at handling single indexed data, but how do I handle
> this double indexing?
>
> /Niels
>
--
------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland
Voice : (+41)26 4180040
Internet: ae...@gr... PGP public key available
------------------------------------------------------------------------
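The intent of the two-level approach in this message, building a flat list of "module.line" indexes, can be sketched in Python. This only illustrates the data structure being aimed for and is not monitoring-language code; `build_double_index` and the sample data are made up:

```python
def build_double_index(lines_per_module: dict) -> list:
    """Flatten {module: [line, ...]} into ['module.line', ...], like ${p}.${i}."""
    all_indexes = []
    for module, lines in lines_per_module.items():
        # one composite index per (module, line) pair
        all_indexes.extend(f"{module}.{line}" for line in lines)
    return all_indexes

# Hypothetical table: module 11 has three lines, module 3 has two
print(build_double_index({11: [1, 2, 3], 3: [1, 2]}))
```

Each composite index can then be treated like a single index in later per-item tests.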
|
|
From: Niels B. <nb...@us...> - 2012-01-11 14:39:30
|
I am right now going to write a monitor for APC power distribution cabinets. They have meters in an SNMP table that is double indexed by module and line.
Does anybody have an idea of how to handle that? The monitoring language is good at handling single indexed data, but how do I handle this double indexing?
/Niels
--
Niels Baggesen - @home - Århus - Denmark - nb...@us...
The purpose of computing is insight, not numbers --- R W Hamming
|