batch rename files with special characters

Not exactly sure if this is the right forum.

I have about 300 files that need renaming, because the file system does not display the French characters properly. Each affected letter has been replaced by a “question mark in a black diamond” symbol.
No way of renaming them has worked, other than using mv in the Konsole. Is there a script or program out there that will do a batch rename?

I have googled, but couldn’t find a solution.

openSUSE 11.3, KDE 4.4.4

Is this a Linux filesystem (like ext2/3) or a non-Linux one? I ask because filenames being displayed in the wrong character encoding points to something like that. Maybe a mount option could help here.

About the mass/batch/script way to rename (mv) them.
You already found out that mv on the console can do this, but even then there are two possibilities:

  1. you start typing the filename and then use filename completion by pressing the Esc key;
  2. you type the filename, but use wildcards (? or *) in place of the non-typable character.

Option 1 cannot be scripted. Option 2 can be scripted if there is some regularity in the filenames, but when they are just random (French) words it will be difficult. Programming is good at repeating the same thing, but not so good at doing everything differently.
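A minimal sketch of the wildcard approach, assuming a hypothetical file whose name contains one raw Latin-1 byte (0xEA, "ê" in ISO-8859-1). The scratch directory and the names t?st.txt and test.txt are made up for illustration. Setting LC_ALL=C is an assumption that makes the shell treat every byte as one character, so that ? reliably matches the odd byte.

```shell
# Work in a scratch directory so no real files are touched.
demo1=$(mktemp -d)

# Create a file whose name holds the single byte 0xEA (octal 352),
# which is "ê" in Latin-1 but not valid UTF-8 on its own.
touch "$demo1/$(printf 't\352st.txt')"

(
  cd "$demo1" || exit 1
  LC_ALL=C            # one byte = one character while globbing
  export LC_ALL
  # The pattern must match exactly one file, otherwise later matches
  # would overwrite test.txt.
  for f in t?st.txt; do
    [ -e "$f" ] || continue   # pattern matched nothing
    mv -- "$f" test.txt
  done
)

ls "$demo1"
```

This only scales to many files when the names follow a pattern, which is exactly the limitation described above.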

Thank you very much for the ESC tip!! What a time saver, because the file names are very very long!

many thanks again.

You are welcome. It has been one of the basic features of most shells since about the 1970s. :wink:

And if you just want to get rid of that one character, there is the
rename command, which nobody ever uses because ‘mv’ does most renaming in
the world. See the manpage for details, but here’s an example:

To get rid of the first ‘ê’ character in the name of every file in the
current directory and replace it with nothing:

rename 'ê' '' *

or replace it with something else like a regular ascii ‘e’:

rename ê e *

So in my setup:

<quote>
ab@mybox0:~/Desktop/test0> ls
têst têst00 têst01 têst02 têst03 têst04 têst05 tst06
</quote>

and then I ran the second command above and had the following:

<quote>
ab@mybox0:~/Desktop/test0> ls
test test00 test01 test02 test03 test04 test05 tst06
</quote>

The first parameter is what you replace, the second is what you replace it
with, and the third is which files you match (all in my case).
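Note that the rename shown here uses the from/to/files syntax of the util-linux tool; some distributions ship a Perl-based rename that takes an expression instead. Where neither is at hand, roughly the same effect can be sketched with a plain loop and sed. This is a swapped-in technique, not the rename command itself, and the scratch directory and file names are hypothetical.

```shell
# Scratch directory with a few sample names (UTF-8 encoded "ê").
demo2=$(mktemp -d)
touch "$demo2/têst" "$demo2/têst00" "$demo2/tst06"

for f in "$demo2"/*; do
  base=${f##*/}                                   # strip the directory part
  # Replace only the first "ê" in the name, as in the rename example.
  new=$(printf '%s\n' "$base" | sed 's/ê/e/')
  # Skip names that do not change (tst06 here).
  [ "$base" = "$new" ] || mv -- "$f" "$demo2/$new"
done

ls "$demo2"
```

The loop assumes file names without embedded newlines, and it will not help while the names are in an encoding the shell cannot type or match, which is the situation discussed further down.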

Good luck.

On 12/10/2010 09:06 AM, hcvv wrote:
>
> You are welcome. It has been one of the basic features of most shells since
> about the 1970s. :wink:
>
>

Thank you. That is very handy to know, since I didn’t even realise there was a rename command. I made a note of it. One learns something new every day.
It doesn’t seem to work with my files, though. The French characters are displayed as � in Dolphin and ? in the Konsole.
Why is that anyway? I get that with German letters too. When I name the files myself my system recognizes them, but if I’m sent files, it shows the silly sign.

Try: export LC_ALL=fr_FR.UTF-8 in a UTF-8 compatible terminal.
If it doesn’t help, try: export LC_ALL=fr_FR.ISO-8859-15

As I hinted above, it has to do with the character encoding. When you are sent files (how?), the names might be encoded in Latin-1 (or some proprietary MS encoding). When they are stored with that sequence of bytes (without you naming them yourself) on a file system that is assumed to be UTF-8 encoded, you have a problem. When the names are interpreted, bytes are found that cannot be part of a UTF-8 encoding, and these are shown as ? or �.
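To make the byte-level difference concrete, here is a small sketch using iconv and od (both assumed to be installed). It shows "ê" as two bytes in UTF-8 and one byte in Latin-1, and that the lone Latin-1 byte does not pass as UTF-8. The variable names are only for illustration.

```shell
# "ê" written out as its two UTF-8 bytes (octal 303 252 = hex c3 aa).
utf8_bytes=$(printf '\303\252' | od -An -tx1 | tr -d ' \n')
echo "UTF-8:   $utf8_bytes"      # prints "UTF-8:   c3aa"

# The same character converted to Latin-1 is the single byte ea.
latin1_bytes=$(printf '\303\252' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')
echo "Latin-1: $latin1_bytes"    # prints "Latin-1: ea"

# Asking iconv to read the Latin-1 byte (followed by "t") as UTF-8 fails,
# which is why such names show up as ? or the replacement character.
if printf '\352t' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  check="valid UTF-8"
else
  check="not valid UTF-8"
fi
echo "0xEA as UTF-8: $check"
```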

You can use

convmv

It will convert the names to UTF-8, the Linux standard.

Here’s a video:

https://www.youtube.com/watch?v=YKFL9j-wRKE
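A cautious way to try convmv is to run it first without --notest, which only prints what it would rename, and then again with --notest to actually do it. The sketch below assumes the names are ISO-8859-1 (a guess that has to be checked for real files) and works on a hypothetical file in a scratch directory; it does nothing if convmv is not installed.

```shell
# Scratch directory with one Latin-1 encoded name ("été.txt" with é = 0xE9).
demo3=$(mktemp -d)
touch "$demo3/$(printf '\351t\351.txt')"

if command -v convmv >/dev/null 2>&1; then
  # Dry run: convmv only reports the planned renames.
  ( cd "$demo3" && convmv -f iso-8859-1 -t utf8 * )
  # Real run: --notest tells convmv to perform the renames.
  ( cd "$demo3" && convmv -f iso-8859-1 -t utf8 --notest * )
else
  echo "convmv is not installed; nothing renamed"
fi

ls "$demo3"
```

If the dry run shows nonsense names, the -f encoding is probably wrong; convmv --list prints the encoding names it knows.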

On 2010-12-10 19:06, von shtupp wrote:

> When I name the
> files myself my system recognizes them, but if I’m being sent files, I
> shows the silly sign.

That’s crucial information!

Those files were probably created in Windows, weren’t they? Then surely they
are using a Windows charset, and as far as I know, Windows filesystems use
a charset that depends on the user’s settings. At least, this is true for
FAT, I’m not sure about NTFS.

(which is why hcvv asked what filesystem you were using)

On the other hand, Linux, or at least our Linux distro, uses the UTF-8 charset
for filesystems (which can use several bytes per char if needed).

Then, there is the medium you use to interchange files. If it is email, the
name has to be encoded somehow, probably in the same charset used for the
rest of the email. You can find out by looking at the raw email text with mc.

It is the email application that is responsible for creating the files when
you get them. You could try another client.

That’s the why. I don’t know how to solve it - except by renaming each file
as you get them.


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)

It looks like a useful tool, and exactly right for this case. The commentary in the video is not very exact, though: it uses the words language, script and encoding as if they all meant the same thing.
Also, UTF-8 is not a Linux standard, but Linux uses open standards when possible, and UTF-8 encoded Unicode is an open standard.
Also, the list at the end shows that MS has its own range of encodings that differ (often only slightly) from the ISO ones. Thus finding out which encoding is used in the filenames one is sent (and what name the tool uses for it) may be the main task, especially as the sender will not tell you :frowning:

i’ve had a similar problem. i found the info gathered here very helpful, incl. the video mentioned above:

convmv installs easily; the command

cnf convmv

tells you which package provides it, and

man convmv

gives you the switches

i have files in several european languages, mainly french, spanish and german. many had the ? instead of special characters.

i checked the Wikipedia article on character encoding for the likely character set (either iso-8859-1 or iso-8859-15)

and set up the command below.

my point here is that it seems safe to execute it recursively (once for each character set), so one can execute it from a higher directory. it won’t touch files which are already utf8.

convmv --notest --replace -r -f iso-8859-1 -t utf8 *

again,

man convmv

gives you the switches. and if you’re new to this: the asterisk means all directories and files are processed (in practice, only those with special characters are changed).
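The "won’t touch files which are already utf8" behaviour can be illustrated with a hand-rolled loop around iconv. This is only a sketch of the idea, not how convmv is implemented; the source encoding ISO-8859-1 is an assumption, and the scratch directory and file names are hypothetical.

```shell
# Scratch directory with one Latin-1 name and one name that is already UTF-8.
demo4=$(mktemp -d)
touch "$demo4/$(printf 'caf\351.txt')"       # "café.txt", é as Latin-1 byte 0xE9
touch "$demo4/$(printf 'th\303\251.txt')"    # "thé.txt", é as UTF-8 bytes c3 a9

for f in "$demo4"/*; do
  base=${f##*/}
  # A name that survives a UTF-8 to UTF-8 pass is already valid: leave it alone.
  if printf '%s' "$base" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    continue
  fi
  # Otherwise reinterpret the bytes as ISO-8859-1 and rename to the UTF-8 form.
  new=$(printf '%s' "$base" | iconv -f ISO-8859-1 -t UTF-8) || continue
  mv -- "$f" "$demo4/$new"
done

ls "$demo4"
```

Because the validity check comes first, running the loop a second time changes nothing, which is what makes a recursive run from a higher directory reasonably safe.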

however, a very few files did not respond, but this was because mathematical characters had been inserted instead of Latin / Western European special characters (i.e. these files were not going to work very well in the first place :expressionless: ).

Use this software, it would be really helpful: “KrojamSoft BatchRenameFiles program”

That is a Windows program. It does no good on Linux.