core file in user directory, what program failed?

Here’s a diff file if someone’s interested in patching the source:

--- Display.c      2010-06-17 09:34:44.000000000 -0400
+++ Display.c    2011-12-11 18:15:10.739478427 -0500
@@ -1859,6 +1859,7 @@ Bool nxagentMakeIcon(Display *display, P
     lenght_env_path = strlen(env_path) + 1;
   strncpy(icon_filename, "", 255);
   strncpy(default_path, "", 255);
+  strncpy(icon_path, "", strlen(env_path) + sizeof(icon_filename) );

   strcat(icon_filename, NXAGENT_ICON_NAME);
@@ -1870,6 +1871,7 @@ Bool nxagentMakeIcon(Display *display, P
     char *temp_path = malloc(lenght_env_path + strlen(icon_filename) );
     char *temp_path1 = malloc(lenght_env_path + strlen(icon_filename) );

+    strncpy(temp_path, "", lenght_env_path + strlen(icon_filename) );
     strncpy(temp_path, env_path, strlen(env_path));
     strncpy(temp_path1, "", lenght_env_path + strlen(icon_filename) );

That’s interesting. Do you have /tmp and /var on separate partitions? Do you have the TMPDIR variable set?

I could apply the patch to the NX package available in my repo … but I would like to understand why you seem to be the only one having this problem (?)

AFAIK FreeNX is not developped anymore. I don’t think submitting a bug report will help.

Sorry, but I don’t believe your diagnosis or C coding.

In the first place the strncpy(temp_path, env_path, strlen(env_path)); will overwrite what’s in the storage pointed to by temp_path, starting at the beginning, so it doesn’t matter what’s in memory before. If it were str(n)cat, then it would be a different story.

In the second place in the call strncpy(temp_path, “”, lenght_env_path + strlen(icon_filename) ); the length argument is useless since the copying will stop after the ‘\0’ in “” has been copied over. You could just have easily have written *temp_path = ‘\0’;

You may be on the right track, but I don’t believe this diagnosis.

/tmp and /var are on the same partitions and TMPDIR is not set.

This machine is a VMware VM running on a Win7 host, maybe that changes things? I’m really not sure why this machine is affected, I’ve not managed to reproduce it on another system.

There is still activity on the FreeNX mailing list, but I don’t know who is managing the upstream source. I suppose I could submit my patch to the mailing list at least.

The original code does not include the strncpy(temp_path, “”… line. I added it, that’s why it’s in red. Without it, what was in memory before does matter.

The diagnosis was simple: when the memory allocated to temp_path was full of junk, the free(temp_path) operation later in the code caused a segfault.

The fix was simple too: temp_path1 never caused a segfault, because the code contained a line to null it, cleaning whatever junk may have been in it. So I just took the null-ing line used for temp_path1 and also peformed one for temp_path.

I’m not a programmer, so there may be better ways to do things. And despite the patch, I still have questions:

  1. Why does malloc sometimes allocate dirty memory? Shouldn’t it be cleaned before allocation?
  2. Why does the strncpy( temp_path, env_path, strlen(env_path)) not seem to resize temp_path to match env_path? Is there no null terminator on env_path or do I need to do (strlen(env_path)+1)?
  3. Why does just this particular machine have the problem? Is it related to it being a VM?

To answer my own question, I re-compiled using the variable lenght_env_path in the statement, because it includes the +1 already and is used in the temp_path1 null statement. Using the new code, temp_path does indeed resize to match env_path.

Manual diff:

--- strncpy(temp_path, env_path, strlen(env_path));
+++ strncpy(temp_path, env_path, lenght_env_path);

I’m just saying that if you submit your patch it will probably be scrutinised by C programmers who will come to the same conclusion as I have, that your added code to “clear the string” really does nothing useful, since strncpy will overwrite whatever was in the string storage anyway, and that if it works, it’s only because you perturbed the code enough to hide the bug which is still lurking somewhere else.

  1. malloc does not guarantee that the memory will be cleared, it says so in the manual page. If you want the memory to be cleared, use calloc.
  2. Your idea that the string can be “resized” indicates that you don’t understand how strings work in C.
  3. No idea.

In short, I think the problem is deep and hard to find, as most memory corruption problems are. Usually the crash is only experienced a distance away from where the actual bug lies.

Since you have resolved the problem for yourself, and FreeNX doesn’t seem to be maintained, I would just leave it be and not try to submit the patch.

I did, but I could not see a way of learning the name of the program that
produced it…

Ambrosine Brody, CA
aviator sunglasses | round sunglasses

Humm … I have no idea. I install FreeNX in kvm virtual machines running on Linux hosts and on physical Linux servers as well and don’t notice differences … but I never used VMware and always managed to stay away from Windows.

If it’s only affecting one machine and you can not reproduce it, it will be difficult to present it as a bug.