Thursday, July 19, 2012

Creating a big file with lseek()

Robert Love writes cool books. I found many interesting things in his book "Linux System Programming". Here is one of them.
The lseek() system call sets the position in the file associated with a file descriptor. But it also has a curious use: lseek() can 'fast forward' past the end of the file. If we then write at the new position, the gap between the old end of the file and that position reads back as zeros.
So we can create files of (almost) any size. For example, 16 terabytes (the maximum file size on ext4 with the usual 4 KiB blocks):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

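int main(int argc, char *argv[])
{
 int fd;
 ssize_t ret;
 off64_t ret64;

 off64_t kilo = 1024;
 off64_t megabyte = kilo * kilo;
 off64_t terabyte = megabyte * megabyte;
 /* Stay one megabyte below the 16 terabyte limit. */
 off64_t offset = 16 * terabyte - megabyte;

 if (argc < 2) {
  fprintf(stderr, "usage: %s FILE\n", argv[0]);
  return 1;
 }

 fd = open(argv[1], O_WRONLY | O_CREAT | O_LARGEFILE, 0644);
 if (fd < 0) { /* 0 is a valid descriptor, only -1 means failure */
  perror("open");
  return 1;
 }

 /* Move the position far past the end of the file. */
 ret64 = lseek64(fd, offset, SEEK_END);
 if (ret64 < 0) {
  perror("lseek");
  return 1;
 }

 /* One byte written here turns everything before it into a hole. */
 ret = write(fd, "0", 1);
 if (ret < 1) {
  perror("write");
  return 1;
 }
 /* "Success!\n" is 9 bytes; a length of 10 would also emit the NUL. */
 write(STDOUT_FILENO, "Success!\n", 9);

 close(fd);
 return 0;
}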

To compile, pass the -D_GNU_SOURCE flag to the compiler. Users of 32-bit systems who rewrite the code with plain lseek() will be limited to 2 gigabytes, the range of a 32-bit off_t, unless they compile with -D_FILE_OFFSET_BITS=64. They can probably also use fsetpos().
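
To see that the gap really is a hole, one can compare the logical size of the file with the space actually allocated for it. The sketch below is only an illustration and not part of the original program: the helper names make_sparse_file(), logical_size() and allocated_bytes() are made up here, and it assumes that st_blocks counts 512-byte units, as it does on Linux.

```c
#define _FILE_OFFSET_BITS 64  /* ask for a 64-bit off_t on 32-bit systems */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) `path`, seek to size - 1
   and write a single byte there. Returns 0 on success, -1 on failure. */
int make_sparse_file(const char *path, off_t size)
{
    int fd;
    int ok;

    if (size < 1)
        return -1;
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* "" is a one-byte array holding '\0', so this writes one zero byte. */
    ok = lseek(fd, size - 1, SEEK_SET) == size - 1
         && write(fd, "", 1) == 1;
    if (close(fd) < 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Logical size in bytes, or -1 on failure. */
off_t logical_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}

/* Bytes actually allocated, or -1 on failure.
   Assumption: st_blocks is in 512-byte units (true on Linux). */
off_t allocated_bytes(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (off_t)st.st_blocks * 512 : (off_t)-1;
}
```

After make_sparse_file("big.img", 1024 * 1024), a filesystem that supports holes (ext4 does) should report far fewer allocated bytes than the logical size. The same two numbers are visible with ls -ls or du.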

Sunday, July 8, 2012

Recursion with main()

Lurking in my filesystem I found some code. Some time ago I tried to call main recursively, but I never finished the code.
Surprisingly, in C main does not differ from other functions (except that it is the 'entry point' of the program). The code below compiles with strict gcc flags and works successfully (echoing the given arguments).

#include <stdio.h>

int main(int argc, char *argv[]) {
    if (argc > 1) {
        printf("%s\n", argv[1]);
        main(argc - 1, argv + 1);
    }
    return 0;
}

Thursday, July 5, 2012

Keep it simple, stupid!

Let's read the code of two simple programs. The first one comes with GNU coreutils in many Linux distributions. It is called "true" and is usually installed as /bin/true:

/* Exit with a status code indicating success.
   Copyright (C) 1999-2012 Free Software Foundation, Inc.

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#include <config.h>
#include <stdio.h>
#include <sys/types.h>
#include <system.h>

/* Act like "true" by default; false.c overrides this.  */
#ifndef EXIT_STATUS
# define EXIT_STATUS EXIT_SUCCESS
#endif

#if EXIT_STATUS == EXIT_SUCCESS
# define PROGRAM_NAME "true"
#else
# define PROGRAM_NAME "false"
#endif

#define AUTHORS proper_name ("Jim Meyering")

void
usage (int status)
{
  printf (_("\
Usage: %s [ignored command line arguments]\n\
  or:  %s OPTION\n\
"),
          program_name, program_name);
  printf ("%s\n\n",
          _(EXIT_STATUS == EXIT_SUCCESS
            ? N_("Exit with a status code indicating success.")
            : N_("Exit with a status code indicating failure.")));
  fputs (HELP_OPTION_DESCRIPTION, stdout);
  fputs (VERSION_OPTION_DESCRIPTION, stdout);
  printf (USAGE_BUILTIN_WARNING, PROGRAM_NAME);
  emit_ancillary_info ();
  exit (status);
}

int
main (int argc, char **argv)
{
  /* Recognize --help or --version only if it's the only command-line
     argument.  */
  if (argc == 2)
    {
      initialize_main (&argc, &argv);
      set_program_name (argv[0]);
      setlocale (LC_ALL, "");
      bindtextdomain (PACKAGE, LOCALEDIR);
      textdomain (PACKAGE);

      atexit (close_stdout);

      if (STREQ (argv[1], "--help"))
        usage (EXIT_STATUS);

      if (STREQ (argv[1], "--version"))
        version_etc (stdout, PROGRAM_NAME, PACKAGE_NAME, Version, AUTHORS,
                     (char *) NULL);
    }

  exit (EXIT_STATUS);
}

The second one belongs to the so-called "sbase" by suckless.org.

/* See LICENSE file for copyright and license details. */
#include <stdlib.h>

int
main(void)
{
 return EXIT_SUCCESS;
}

These programs do one thing: nothing. They just exit, returning a successful status. Which one follows the UNIX ideas more closely? Which one follows common sense? Answer these questions for yourself.


Sunday, June 17, 2012

Linux Thread ID

There is a useful and interesting book about Linux programming, freely available online: Advanced Linux Programming. One of the outdated topics in this book is section 4.5, "GNU/Linux Thread Implementation". The book says that on GNU/Linux threads are implemented as processes. The following code should confirm that, but it no longer does:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
void *thread_fun(void *arg)
{
    fprintf(stderr, "child thread pid is %d\n", (int)getpid());
    while(1);
    return NULL;
}

int main(void)
{
    pthread_t thread;
    fprintf(stderr, "main thread pid is %d\n", (int)getpid());
    pthread_create(&thread, NULL, &thread_fun, NULL);
    while(1);
    return 0;
}

So, instead of different process IDs we get a single one:

$ ./alp-thread-pid
main thread pid is 17280
child thread pid is 17280

How can we get a thread ID? There is a simple function, pthread_t pthread_self(void);, in <pthread.h>. We must use this function if we write a portable program. These thread IDs are only guaranteed to be unique within a process, and may be reused after a terminated thread has been joined or a detached thread has terminated. The value returned by pthread_self has an opaque type: POSIX.1 allows it to be an arithmetic type or a structure. So we cannot print it portably if we don't know how the implementation represents it.
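
Although a pthread_t cannot be printed portably, two of them can still be compared with pthread_equal(). Below is a minimal sketch of that; struct probe, compare_with_caller() and child_differs_from_caller() are names invented for this illustration. The comparison is done inside the new thread while the caller is blocked in pthread_join(), so both IDs are still valid at that moment. Build with -pthread.

```c
#include <pthread.h>
#include <stddef.h>

/* Data shared between the caller and the new thread (hypothetical). */
struct probe {
    pthread_t caller; /* ID of the thread that started us */
    int same;         /* set by the child: 1 if the IDs compare equal */
};

/* Thread body: compare our own ID with the caller's ID. */
static void *compare_with_caller(void *arg)
{
    struct probe *p = arg;

    /* pthread_equal() returns nonzero when both IDs name one thread. */
    p->same = pthread_equal(p->caller, pthread_self()) != 0;
    return NULL;
}

/* Returns 1 if a new thread has a different ID than the calling thread,
   0 if the IDs compare equal, -1 if the thread could not be run. */
int child_differs_from_caller(void)
{
    struct probe p;
    pthread_t thread;

    p.caller = pthread_self();
    p.same = 0;

    if (pthread_create(&thread, NULL, compare_with_caller, &p) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return p.same ? 0 : 1;
}
```

This is the only comparison POSIX guarantees for thread IDs; using == on pthread_t values works only where the type happens to be arithmetic.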
However, the Linux kernel uses its own thread identification. We can get the kernel thread ID by calling pid_t gettid(void); (pid_t comes from <sys/types.h>). But at the time of writing glibc does not provide a wrapper for this system call! (One was added later, in glibc 2.30.) So, let's write it ourselves and print it:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

pid_t gettid(void)
{
    return syscall(SYS_gettid);
}

void *thread_fun(void *arg)
{
    fprintf(stderr, "child thread pid is %d\n", (int)getpid());
    fprintf(stderr, "child kernel thread tid is %d\n", (int)gettid());
    while(1);
    return NULL;
}

int main(void)
{
    pthread_t thread;
    fprintf(stderr, "main thread pid is %d\n", (int)getpid());
    fprintf(stderr, "main kernel thread tid is %d\n", (int)gettid());
    pthread_create(&thread, NULL, &thread_fun, NULL);
    while(1);
    return 0;
}

Run it:

$ ./thread-pid 
main thread pid is 18866
main kernel thread tid is 18866
child thread pid is 18866
child kernel thread tid is 18867

We found our TID!
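
The relationship visible in this output can also be checked in code: in the main thread the kernel TID equals the PID, while in any other thread it differs. The following is a Linux-only sketch under that assumption. The names kernel_tid(), store_tid() and child_tid() are hypothetical and were chosen so they do not clash with the gettid() that newer glibc versions declare. Build with -pthread.

```c
#define _GNU_SOURCE  /* makes syscall() visible under strict -std= modes */
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread, via the raw system call. */
pid_t kernel_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Thread body: store this thread's TID where the caller can read it. */
static void *store_tid(void *arg)
{
    *(pid_t *)arg = kernel_tid();
    return NULL;
}

/* TID observed inside a freshly created thread, or -1 on failure. */
pid_t child_tid(void)
{
    pthread_t thread;
    pid_t tid = -1;

    if (pthread_create(&thread, NULL, store_tid, &tid) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return tid;
}
```

Called from the main thread, kernel_tid() should return the same number as getpid(), and child_tid() should return a different positive number, just like the two TID lines printed above.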